TriViewBench: Controlled Complexity Scaling for Multi-View Structural Reasoning in MLLMs

Computer Science · AI Jun 25, 2026

Why AI vision systems fail when objects hide and multiply across views

Yu-Yang Chen, Lan-Zhe Guo
arXiv:2606.26029

Summary

All 18 major AI vision systems tested share the same weakness: they handle simple visual questions well, but their accuracy collapses when they are asked to count objects (59% drop) or to understand complex 3D scenes (80% drop). The failures stem from two distinct problems: the systems either miss hidden objects or confuse the same object across different camera angles. Simply asking them to "think step by step" does not help.

Why it matters

Vision-capable AI systems are being deployed in robotics, autonomous vehicles, and industrial inspection, where missing a hidden object or misidentifying an item across viewpoints could cause real failures. This benchmark reveals a fundamental blind spot that current prompting tricks cannot fix, which suggests that engineers need to rebuild how these systems represent 3D space rather than only improve their reasoning.

Posted on arXiv · Jun 24, 2026