Reasoning Is Not Free: Robust Adaptive Cost-Efficient Routing for LLM-as-a-Judge

Statistics May 12, 2026

When should AI judges actually think through their decisions?

Wenbo Zhang, Lijinghua Zhang, Liner Xiang et al.
arXiv:2605.10805

Summary

Reasoning-capable AI judges dramatically improve accuracy on complex tasks like math and code verification, but waste computation on simpler evaluations, which suggests they should be deployed selectively rather than everywhere. The researchers developed RACER, a system that automatically routes each task to either a reasoning judge or a fast judge based on difficulty and cost, maintaining accuracy while staying within a fixed computing budget even when the mix of task types shifts unexpectedly.
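
To make the routing idea concrete, here is a minimal sketch of budget-constrained routing between a fast judge and a reasoning judge. It is not the RACER algorithm from the paper: it is a simple greedy heuristic that assumes a difficulty estimate is already available for each task, and it does nothing to handle shifts in task types. The names `Task`, `route`, `fast_cost`, `reasoning_cost`, and `budget`, along with the example scores and costs, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    difficulty: float  # hypothetical difficulty estimate in [0, 1]


def route(tasks, fast_cost, reasoning_cost, budget):
    """Return {task name: "fast" | "reasoning"} with total cost <= budget.

    Greedy heuristic (an illustrative assumption, not the paper's method):
    give every task the fast judge, then upgrade the hardest tasks to the
    reasoning judge for as long as the remaining budget allows.
    """
    base = fast_cost * len(tasks)
    if base > budget:
        raise ValueError("budget cannot cover the fast judge for every task")

    remaining = budget - base
    upgrade = reasoning_cost - fast_cost  # extra cost of one upgrade
    plan = {t.name: "fast" for t in tasks}

    # Spend the leftover budget on the hardest tasks first.
    for t in sorted(tasks, key=lambda t: t.difficulty, reverse=True):
        if upgrade > remaining:
            break
        plan[t.name] = "reasoning"
        remaining -= upgrade
    return plan


tasks = [
    Task("math_proof", 0.9),
    Task("code_review", 0.7),
    Task("chat_tone", 0.2),
    Task("spelling", 0.1),
]
plan = route(tasks, fast_cost=1.0, reasoning_cost=5.0, budget=12.0)
print(plan)
```

With these made-up numbers, the budget covers the fast judge for all four tasks plus two upgrades, so the two hardest tasks go to the reasoning judge and the other two stay on the fast judge.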

Why it matters

AI-as-a-judge systems are increasingly used to automatically grade student work, evaluate code, and validate outputs in production systems. Engaging expensive reasoning only where it is needed cuts computational waste while maintaining accuracy. That matters for companies running these evaluations at scale, where every percentage point of wasted compute multiplies across millions of judgments.

Posted on arXiv · May 11, 2026