PAPER PLAINE

EntityBench: Towards Entity-Consistent Long-Range Multi-Shot Video Generation

Testing AI's ability to keep characters consistent across long video sequences

Researchers built EntityBench, a standardized test for video-generation AI that measures whether a system keeps the same characters, objects, and locations consistent across long sequences of shots. The test, built from real TV episodes, shows that existing systems struggle badly when characters reappear after long gaps. A new memory-based approach, EntityMem, achieved significantly better character consistency than existing methods.

Generating coherent multi-scene videos is a step toward AI that can create longer, more complex visual stories, from TV-like narratives to advertisements and filmmaking. Right now, when a character disappears from frame for several minutes and then reappears, AI systems often render them looking completely different, which breaks the viewer's experience. EntityBench gives researchers a concrete way to measure this problem and track improvements, which could speed progress toward AI that maintains visual continuity over extended sequences.