PAPER PLAINE


KVEraser: Learning to Steer KV Cache for Efficient Localized Context Erasing

Removing unwanted information from an AI's memory without reprocessing everything

When large language models process long documents, they cache intermediate results for speed, but some of that cached information can turn out to be irrelevant or harmful after processing has started. KVEraser, a new technique, removes specific spans of cached information by replacing only those spans' key-value (KV) cache entries with learned alternatives, rather than forcing the system to reprocess thousands of subsequent tokens. On documents up to 32,000 tokens long, it achieves nearly the same accuracy as full recomputation while being 7 times faster.
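To make the cost difference concrete, here is a toy sketch in plain Python. It is not KVEraser's implementation: the cache layout, the function names, and the zero-valued replacement entries are all assumptions made for illustration, and the step where replacement entries are actually learned is left out entirely. It only shows that patching a span touches the span's own entries, whereas full recomputation under causal attention redoes every position from the span onward.

```python
from typing import List, Tuple

Vector = List[float]
KVEntry = Tuple[Vector, Vector]  # (key, value) for one token position


def erase_span(cache: List[KVEntry], start: int, end: int,
               replacement: List[KVEntry]) -> List[KVEntry]:
    """Return a new cache with positions [start, end) overwritten.

    `replacement` stands in for the learned entries; here the caller
    supplies placeholder values (hypothetical, not learned).
    """
    if not 0 <= start <= end <= len(cache):
        raise ValueError("span out of range")
    if len(replacement) != end - start:
        raise ValueError("replacement must match the span length")
    # Entries before and after the span are reused unchanged.
    return cache[:start] + list(replacement) + cache[end:]


def entries_touched_by_patch(start: int, end: int) -> int:
    """Patching rewrites only the erased span."""
    return end - start


def entries_touched_by_recompute(cache_len: int, start: int) -> int:
    """Full recomputation redoes every position from the span onward,
    because later positions attended to the span when they were built."""
    return cache_len - start


# Toy cache: 10 positions, 2-dimensional keys and values.
cache = [([float(i), 0.0], [0.0, float(i)]) for i in range(10)]
placeholder = [([0.0, 0.0], [0.0, 0.0]) for _ in range(2)]
patched = erase_span(cache, 2, 4, placeholder)

print(entries_touched_by_patch(2, 4))
print(entries_touched_by_recompute(len(cache), 2))
```

The gap between the two counts grows with the number of tokens that follow the erased span, which is why the saving matters most in long contexts.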

Long-context AI applications frequently encounter stale search results, incorrect tool outputs, or harmful injected content that only becomes apparent mid-processing. KVEraser enables real-time removal of this bad information without the computational penalty that would otherwise make it impractical, reducing the cost from a 17.6x slowdown to a slowdown of just 24%. This makes it feasible to build AI systems that can correct themselves and respond safely to new user instructions mid-conversation.