LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

Computer Science · AI · May 25, 2026

Why making AI models bigger sometimes makes them worse

Xu Ouyang, Deyi Liu, Yuhang Cai et al.
arXiv:2605.23901

Summary

Large language models can stop improving, and sometimes get worse, when they are scaled up without careful balance, much as adding noise to a radio signal eventually drowns out the message. The researchers applied Shannon's information theory, which originally explained how much data can travel reliably through a noisy communication channel, to model training and found that it predicts this counterintuitive breakdown far better than existing scaling laws.
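For readers who want the radio analogy in concrete terms, here is a minimal sketch of the classical Shannon-Hartley capacity formula, C = B log2(1 + S/N), for a channel with additive Gaussian noise. This is the textbook communication-channel result, not the paper's model of training; the function name `shannon_capacity` and the power values are illustrative assumptions.

```python
import math


def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Textbook Shannon-Hartley capacity, in bits per second.

    bandwidth_hz: channel bandwidth B in hertz.
    snr_linear:   signal-to-noise power ratio S/N as a plain ratio (not dB).
    """
    if bandwidth_hz < 0 or snr_linear < 0:
        raise ValueError("bandwidth and SNR must be non-negative")
    return bandwidth_hz * math.log2(1.0 + snr_linear)


if __name__ == "__main__":
    # Illustrative values only: hold signal power fixed and raise the noise power.
    signal_power = 1.0
    for noise_power in (0.01, 0.1, 1.0, 10.0):
        snr = signal_power / noise_power
        capacity = shannon_capacity(1.0, snr)  # 1 Hz bandwidth, i.e. bits/s per Hz
        print(f"noise={noise_power:>5}  SNR={snr:>6.1f}  capacity={capacity:.3f} bits/s/Hz")
```

With the signal power held fixed, each increase in noise power lowers the signal-to-noise ratio and therefore the rate at which information can be carried reliably, which is the intuition the radio analogy relies on.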

Why it matters

Teams building AI models currently spend billions scaling up compute and data on the assumption that bigger always means better. This framework shows there is a ceiling, a signal-to-noise ratio threshold, beyond which throwing more resources at training actually degrades performance. The predictions hold up across different model sizes and perturbations, so practitioners can estimate where that threshold lies before wasting compute, and researchers have a principled way to understand when and why scaling strategies fail.

Posted on arXiv · May 22, 2026