KVEraser: Learning to Steer KV Cache for Efficient Localized Context Erasing

Computer Science · AI · Jun 16, 2026

Removing unwanted information from AI's memory without reprocessing everything

Mufei Li, Shikun Liu, Dongqi Fu et al.
arXiv:2606.17034

Summary

When a large language model processes a long document, it caches intermediate state (the KV cache) for speed, but some of that content can turn out to be irrelevant or harmful after processing has started. KVEraser, a new technique, removes a specific span of cached content by replacing only that span's cache entries with learned alternatives, rather than forcing the system to reprocess the thousands of tokens that follow. On documents up to 32,000 tokens long, it achieves nearly the same accuracy as full recomputation while being 7 times faster.
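To make the cost difference concrete, here is a toy sketch in Python. It is not the authors' implementation: the cache is modelled as a flat list of (key, value) pairs, the replacement entries are placeholder zeros standing in for the learned alternatives, and the function names (`erase_span`, `positions_rewritten_by_erase`, `positions_recomputed_by_full_pass`) are hypothetical. It also assumes the replacement has the same length as the erased span, which the summary does not state.

```python
from typing import List, Tuple

# Toy (key, value) pair. Real KV caches hold per-layer, per-head tensors.
KV = Tuple[float, float]


def erase_span(cache: List[KV], start: int, end: int,
               replacement: List[KV]) -> List[KV]:
    """Return a new cache with positions [start, end) overwritten.

    Assumption: `replacement` stands in for the learned entries and has
    exactly the same length as the erased span.
    """
    if not (0 <= start <= end <= len(cache)):
        raise ValueError("span out of range")
    if len(replacement) != end - start:
        raise ValueError("replacement must match span length")
    # Entries before and after the span are reused untouched.
    return cache[:start] + list(replacement) + cache[end:]


def positions_rewritten_by_erase(start: int, end: int) -> int:
    """Localized erasing only rewrites the span itself."""
    return end - start


def positions_recomputed_by_full_pass(cache_len: int, end: int) -> int:
    """Full recomputation drops the span and reprocesses every later token,
    because those tokens' cache entries were computed with the span visible."""
    return cache_len - end


if __name__ == "__main__":
    # Illustrative sizes only, not figures from the paper.
    cache_len, start, end = 32_000, 1_000, 1_200
    print("rewritten by localized erase:",
          positions_rewritten_by_erase(start, end))
    print("recomputed by full pass:",
          positions_recomputed_by_full_pass(cache_len, end))
```

The sketch only counts touched positions. It says nothing about how the replacement entries are learned or how accuracy is preserved, which is the substance of the paper.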

Why it matters

Long-context AI applications frequently encounter stale search results, incorrect tool outputs, or harmful injected content that only become apparent mid-processing. KVEraser enables real-time removal of this bad information without the computational penalty that would otherwise make it impractical, turning a 17.6x slowdown into a 24% slowdown. This makes it feasible to build AI systems that can correct themselves and respond safely to new user instructions mid-conversation.

Posted on arXiv · Jun 15, 2026