Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling

Computer Science · AI · Jun 2, 2026

Teaching AI judges to trust their eyes over plausible-sounding lies

Seojeong Park, Jiho Choi, Junyong Kang et al.
arXiv:2606.02578

Summary

Multimodal AI systems used as judges of image-and-text outputs tend to believe convincing written descriptions even when the images say otherwise. The researchers built a training dataset of deliberately perturbed image-text pairs that expose these perceptual blind spots, then used it to retrain evaluation models. The retrained models consistently prioritize what they actually see over what sounds reasonable.

Why it matters

AI judges are increasingly used to rank model outputs in real-world applications, from content moderation to scientific image analysis. If these systems can be fooled by false narratives that contradict visual evidence, they produce unreliable scores, and those errors spread downstream. This work makes evaluators more trustworthy by forcing them to ground their judgments in actual perception rather than text plausibility.

Posted on arXiv · Jun 1, 2026