PAPER PLAINE

Fresh research, simply explained. Updates twice daily.

KVEraser: Learning to Steer KV Cache for Efficient Localized Context Erasing

Removing unwanted information from an AI's memory without reprocessing everything

When large language models process long documents, information gets cached for speed—but sometimes that information becomes irrelevant or harmful after processing starts. KVEraser, a new technique, removes specific spans of cached information by replacing only their memory traces with learned alternatives, rather than forcing the system to reprocess thousands of subsequent tokens. On documents up to 32,000 tokens long, it achieves nearly the same accuracy as full recomputation while being 7 times faster.

Long-context AI applications frequently encounter stale search results, incorrect tool outputs, or harmful injected content that only become apparent mid-processing. KVEraser enables real-time removal of this bad information without the computational penalty that would otherwise make it impractical, cutting the cost of erasure from a 17.6x slowdown to a slowdown of just 24%. This makes it feasible to build AI systems that can correct themselves and respond safely to new user instructions mid-conversation.

Simulation-Based Multi-Fillet Evaluation of Woody Breast Poultry Fillets

Spotting diseased chicken meat by watching multiple fillets bend at once

A chicken breast disease called woody breast makes meat tough and worthless, but current detection systems only scan one fillet at a time, slowing down processing plants. Researchers created a physics-based computer simulation and a new camera angle for evaluating multiple fillets simultaneously by tracking how they bend, offering a faster alternative to the existing method.

Poultry processing plants lose millions annually to woody breast going undetected. A system that evaluates several fillets at once instead of one could speed up quality control lines while catching more diseased meat before it reaches consumers. This approach could help producers reduce waste and improve food safety without expensive equipment overhauls.

Revisiting Trade-sign Long-memory and Square-root Law price impact

Why large trades leave predictable price fingerprints in financial markets

When traders execute large orders, markets exhibit two well-known patterns: past trade directions predict future ones (long-memory), and price impact grows with the square root of order size rather than linearly. This paper derives both patterns from a single mathematical framework based on how buy and sell orders pile up in the market, showing that the long-memory effect is really about timing of trades, while the square-root law reflects the market's actual survival and stability.
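To make the square-root law concrete, here is a minimal sketch comparing how impact scales under a square-root rule and under a linear rule. The function names, the coefficient, and the volume and volatility figures are illustrative assumptions, not values or code from the paper.

```python
import math

def sqrt_impact(order_size, daily_volume, volatility, coefficient=1.0):
    """Price impact under a square-root law (all inputs are illustrative)."""
    return coefficient * volatility * math.sqrt(order_size / daily_volume)

def linear_impact(order_size, daily_volume, volatility, coefficient=1.0):
    """Price impact if it instead grew in direct proportion to order size."""
    return coefficient * volatility * (order_size / daily_volume)

# Hypothetical market: 1,000,000 shares traded per day, 2% daily volatility.
DAILY_VOLUME = 1_000_000
VOLATILITY = 0.02

small_order, large_order = 10_000, 40_000  # the large order is 4x the small one

sqrt_ratio = (sqrt_impact(large_order, DAILY_VOLUME, VOLATILITY)
              / sqrt_impact(small_order, DAILY_VOLUME, VOLATILITY))
linear_ratio = (linear_impact(large_order, DAILY_VOLUME, VOLATILITY)
                / linear_impact(small_order, DAILY_VOLUME, VOLATILITY))

print(f"4x larger order, square-root law: impact grows about {sqrt_ratio:.2f}x")
print(f"4x larger order, linear law:      impact grows about {linear_ratio:.2f}x")
```

Under the square-root rule, quadrupling the order only roughly doubles the impact, which is why the distinction matters for anyone estimating execution costs.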

Large institutional investors rely on these patterns to predict how much a trade will move the market and to design execution strategies that minimize costs. Clarifying exactly why these patterns emerge—and distinguishing between patterns that depend on how often trades happen versus how many shares move—helps traders and risk managers build more accurate models of real market behavior and avoid costly surprises when market conditions shift.

When in Doubt, Plan It Out: Committed Small Language Model Deliberation for Reactive Reinforcement Learning

Pairing quick AI reflexes with slow, careful thinking for better decisions

A hybrid system called PACT combines a fast, instinctive AI policy with a small language model that stops to think and plan. When the AI encounters unfamiliar situations, it calls on the language model to generate and test action plans before committing to them, dramatically outperforming either approach alone on difficult navigation tasks.

AI systems deployed in the real world—robots, autonomous vehicles, safety-critical systems—often fail when they encounter situations they weren't trained on. PACT shows that adding a deliberative planning step can catch and prevent these failures without retraining the core system, making existing AI safer and more reliable when conditions change unexpectedly.

Circuit Tracing in Autoregressive Protein Language Models

Decoding how AI models generate new protein sequences

Researchers created ProGenMech, a new tool to reverse-engineer how protein-generating AI models work. By tracing the computational pathways through these models, they discovered that the systems identify sparse, meaningful patterns—like conserved sequence motifs—that guide protein generation and predict protein quality, revealing that the AI learns recognizable biological logic rather than just statistical shortcuts.

Protein generation AI could accelerate drug discovery and enzyme design, but scientists can only trust these models once they understand what the AI is actually doing. By making these models interpretable, researchers can verify the generated proteins follow real biological principles, catch failures before expensive lab testing, and potentially steer the AI toward specific desired properties—turning black-box generation into a tool biologists can actually use.

Bath memory as a precision resource in quantum transport

Using quantum bath memory to squeeze more precision from atomic-scale devices

Physicists have identified how to harness the quantum environment surrounding tiny conductors to reduce noise and boost measurement precision. The key is tuning the bandwidth of this environment to create synchronized interference patterns in electron flow, allowing devices to achieve better precision than systems without this engineered memory effect.

Quantum dots and other nanoscale devices are candidates for ultra-precise sensors and quantum computers, but noise from their surroundings degrades performance. This work provides experimentalists with a concrete, measurable target—the minimum current noise point—that tells them when their device is operating at peak precision, making it practical to build better quantum technologies.

Persona-Pruner: Sculpting Lightweight Models for Role-Playing

Shrinking AI chatbots without losing their personality or ability to act like specific characters

A new method called Persona-Pruner can strip away unnecessary parts of large language models while keeping the specific personality traits needed for a single character role. When tested, it preserved 93.8% more of the original performance compared to standard pruning techniques, creating lightweight models that still sound and act like their intended persona.

Video games, virtual assistants, and interactive storytelling platforms often need dozens or hundreds of distinct NPC characters running simultaneously. Current AI chatbots require running a full, massive model for each character, which is computationally expensive and slow. Persona-Pruner makes each character's AI 5–10 times smaller without noticeable degradation, which means more characters can run at once on cheaper hardware, making complex interactive worlds actually affordable to build and operate.

A Statistical and Machine Learning Framework for Operational Threshold Detection and Deployable Dispatch Controller Development in Hydrogen Multi-Energy Systems

Machine learning reveals hidden patterns in hydrogen energy systems

Researchers analyzed a year of real operating data from a hydrogen energy system and found that solar power alone explains nearly half of hydrogen production variation—but wind's importance only became visible when they switched from traditional statistics to machine learning methods. This revealed that wind affects hydrogen production in complex, non-linear ways that simple correlation measures completely miss, suggesting that solar and wind interact in ways traditional analysis can't detect.
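A toy example shows how a simple correlation measure can miss a strong non-linear relationship. This uses made-up data, not the study's measurements; the quadratic relationship and the names `driver` and `output` are assumptions chosen purely for clarity.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data: the output depends entirely on the driver, but not linearly.
driver = list(range(-10, 11))
output = [x * x for x in driver]

# A plain correlation sees almost nothing, because the rising and
# falling halves of the curve cancel each other out.
linear_r = pearson(driver, output)

# Giving the analysis a non-linear feature recovers the relationship.
squared_driver = [x * x for x in driver]
nonlinear_r = pearson(squared_driver, output)

print(f"correlation with raw driver:     {abs(linear_r):.3f}")
print(f"correlation with squared driver: {nonlinear_r:.3f}")
```

Machine learning models can discover such curved relationships on their own, which is one plausible reading of why wind only showed up once the researchers moved beyond correlation.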

Hydrogen systems are being built now as part of the shift to renewable energy, but operators don't yet know how to run them efficiently. This framework provides a practical toolkit for predicting when to make hydrogen and when to sell it back to the grid, potentially reducing waste and improving revenue. The finding that machine learning uncovers real dynamics hidden from traditional statistics means energy operators need both approaches working together to actually optimize these systems.

Wealth Inequality and Planetary Boundaries in a Stylized Agent-Based Model

Why rich countries stay trapped burning fossil fuels despite knowing better

A computer simulation of economic decisions reveals a vicious cycle: wealthy people and nations feel insulated from climate disasters, so they invest less in clean energy, which slows the transition away from fossil fuels even when most people care about the environment. The model shows this trap persists at wealth-inequality levels matching those of today's developed countries, and that carbon taxes or green subsidies only work if they're paired with policies that reduce inequality itself.

Policymakers trying to accelerate the shift to renewable energy often assume the main barriers are technological or financial. This research suggests inequality itself is the lock. It implies that climate plans which ignore wealth distribution—taxing the rich heavily without redistributing gains—will fail or move glacially. Countries may need to combine green investment with income redistribution, not choose between them.

From Self-Supervised Speech Models to Mixture-of-Experts for Robust Anti-Spoofing

Making voice-cloning detection work against new fake-speech techniques

Researchers upgraded a speech-analysis AI system using a technique called Mixture-of-Experts, which lets multiple specialized neural networks work together to catch synthetic voices. The system reduced errors by 12% when tested against 14 different datasets of spoofed audio, and crucially, it maintained its ability to detect new types of fake speech it had never encountered before.
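For readers unfamiliar with Mixture-of-Experts, the general idea can be sketched in a few lines: a gate scores each expert for the current input, and the final answer is a weighted blend of the experts' outputs. Everything below (the two experts, the feature names, the gate) is hypothetical and far simpler than the system in the paper.

```python
import math

def softmax(scores):
    """Turn raw gate scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_of_experts(features, experts, gate):
    """Blend expert outputs using gate-derived weights."""
    weights = softmax(gate(features))
    outputs = [expert(features) for expert in experts]
    combined = sum(w * o for w, o in zip(weights, outputs))
    return combined, weights

# Hypothetical experts: each returns a spoof score between 0 and 1.
def pitch_expert(features):
    return features["pitch_artifact"]

def noise_expert(features):
    return features["noise_artifact"]

# Hypothetical gate: trusts each expert more when its cue is stronger.
def gate(features):
    return [2.0 * features["pitch_artifact"], 2.0 * features["noise_artifact"]]

features = {"pitch_artifact": 0.8, "noise_artifact": 0.2}
score, weights = mixture_of_experts(features, [pitch_expert, noise_expert], gate)

print(f"expert weights: {[round(w, 3) for w in weights]}")
print(f"blended spoof score: {score:.3f}")
```

In real systems the experts and the gate are neural networks trained together, but the blending step follows this same pattern.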

Voice-based authentication is increasingly used for banking, phone systems, and security—making reliable detection of deepfake audio critical. As AI-generated speech becomes more convincing, anti-spoofing systems that fail on novel synthesis methods create real security gaps. This approach offers measurably better detection across diverse generation techniques, meaning voice-based systems can defend against both current and emerging deepfake threats.

Federated Learning for Feature Generalization with Convex Constraints

Helping distributed AI systems learn shared skills without overfitting to local data

When machine learning models train across multiple devices with different data, they often overfit to their local information and lose the ability to generalize. Researchers developed FedCONST, which automatically adjusts how much each device's updates influence the shared model, ensuring that well-learned features don't drown out weaker ones during the merging process.
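The merging step can be pictured as a weighted average of each device's update, where the weights are non-negative and sum to one. The sketch below shows only that generic averaging, in the style of standard federated averaging; it is not FedCONST's actual weighting rule, and the function names and numbers are assumptions for illustration.

```python
def normalize_weights(raw_weights):
    """Scale non-negative weights so they sum to 1 (a convex combination)."""
    if any(w < 0 for w in raw_weights):
        raise ValueError("weights must be non-negative")
    total = sum(raw_weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in raw_weights]

def aggregate(client_updates, raw_weights):
    """Merge per-device updates into one shared update by weighted averaging."""
    weights = normalize_weights(raw_weights)
    size = len(client_updates[0])
    return [
        sum(w * update[i] for w, update in zip(weights, client_updates))
        for i in range(size)
    ]

# Three hypothetical devices, each proposing a two-parameter update.
client_updates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

equal = aggregate(client_updates, [1, 1, 1])       # every device counts the same
reweighted = aggregate(client_updates, [1, 1, 2])  # third device counts double

print("equal weights:     ", [round(v, 3) for v in equal])
print("adjusted weights:  ", [round(v, 3) for v in reweighted])
```

Changing the weights changes which devices shape the shared model most, which is the lever a method like the one described would be tuning.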

Federated learning powers real-world systems like predictive keyboards, health apps, and industrial sensors that must learn from private data without sending it to a central server. Better generalization means these systems work reliably when deployed to new users or environments, rather than degrading because they memorized quirks of their training group. This directly improves the practical performance of privacy-preserving AI across smartphones, hospitals, and distributed networks.

CFOs Meet LLMs

Can AI predict what business leaders actually think about the economy?

Researchers prompted an AI language model to role-play as CFOs of real companies and answer questions about economic optimism. The AI's answers matched what those CFOs actually said in surveys with striking accuracy, even after accounting for the companies' past responses and characteristics. This suggests LLMs could replace expensive, slow-to-conduct surveys with instant, continuous snapshots of business sentiment across thousands of firms.

Business leaders' economic outlook drives hiring, investment, and lending decisions that ripple through the entire economy. Currently, policymakers and investors rely on surveys of just a few hundred CFOs that arrive months late. If AI can reliably predict what executives are thinking in real time, economists and the Federal Reserve could spot economic shifts weeks or months earlier and adjust policy accordingly—potentially catching slowdowns before they happen or avoiding overheating.