On the (In-)Security of the Shuffling Defense in Transformer Secure Inference
Why shuffling an AI model's internal activations doesn't actually hide them from attackers
A security technique meant to protect AI models during remote computation (shuffling the model's internal activations before revealing them) can be defeated with roughly $1 worth of queries. Researchers show how to match the shuffled values back to their original positions, then use them to recover the model's actual weights, and demonstrate the attack on real models like GPT-2.
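To see why a fixed shuffle offers so little protection, consider a toy linear layer. The sketch below is a minimal illustration under simplifying assumptions (a single linear layer whose outputs are shuffled by one fixed secret permutation, and a public reference copy of the weights available for alignment); it is not the paper's exact attack, and names like `reference_W` are hypothetical.

```python
# Toy demonstration: un-shuffling a permuted linear layer, then recovering
# its weights. This is an illustrative sketch, not the paper's algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
d_in, d_out, n_queries = 64, 32, 512

# Secret server-side layer and the fixed shuffling permutation it applies.
W_secret = rng.normal(size=(d_out, d_in))
perm = rng.permutation(d_out)

# Attacker sends queries and observes the shuffled activations.
X = rng.normal(size=(n_queries, d_in))
Y_shuffled = (X @ W_secret.T)[:, perm]

# Step 1: each shuffled output coordinate is still a fixed linear function
# of the input, so least squares recovers the rows up to the unknown order.
B, *_ = np.linalg.lstsq(X, Y_shuffled, rcond=None)
W_permuted = B.T  # shape (d_out, d_in), rows in shuffled order

# Step 2: align rows against a reference model (e.g., a public checkpoint
# of the same architecture) via maximum-similarity bipartite matching.
# Here we use the secret weights as the reference purely to verify the idea.
reference_W = W_secret
similarity = reference_W @ W_permuted.T       # pairwise row dot products
_, shuf_idx = linear_sum_assignment(-similarity)  # maximize total similarity
W_recovered = W_permuted[shuf_idx]

print("max abs error:", np.abs(W_recovered - W_secret).max())  # ~1e-13
```

Because least squares pins down each row exactly (up to order) in this noiseless toy, the Hungarian matching step fully undoes the permutation; the real attack must align noisy, nonlinear activations, but the same match-then-recover structure applies.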
As AI systems move to the cloud, companies rely on cryptographic defenses to keep model weights secret while still computing results. This attack shows that a widely used shuffling defense provides only a false sense of security: companies relying on it may believe their models are protected when they are in fact vulnerable to cheap theft. Developers need stronger defenses before deploying sensitive models to untrusted servers.