Archive, Page 10 — Paper Plaine

Statistics Jun 17, 2026

Tensor-based second-order causal discovery

Finding cause-and-effect relationships by analyzing how variables respond to changes

Nathan Ouyang, Kexin Wan, Anna Seigal
arXiv:2606.18074

Summary

A new algorithm called TSCD can uncover which variables cause which others by analyzing data from experiments where researchers deliberately change one thing at a time. The method works with far fewer experiments than you'd expect, needing only a number proportional to the logarithm of the number of variables, and handles both linear and nonlinear relationships without requiring the data to be normally distributed.

Why it matters

Identifying true causes rather than just correlations is essential in fields from medicine to economics, where treating a symptom won't help if you don't know what causes it. TSCD's ability to work with fewer experiments saves time and resources, while its efficiency means it can handle systems with hundreds of variables—making it practical for real-world problems like understanding gene networks or economic supply chains.

Posted on arXiv · Jun 16, 2026

Computer Science · AI Jun 16, 2026

KVEraser: Learning to Steer KV Cache for Efficient Localized Context Erasing

Removing unwanted information from AI's memory without reprocessing everything

Mufei Li, Shikun Liu, Dongqi Fu et al.
arXiv:2606.17034

Summary

When large language models process long documents, information gets cached for speed—but sometimes that information becomes irrelevant or harmful after processing starts. KVEraser, a new technique, removes specific spans of cached information by replacing only their memory traces with learned alternatives, rather than forcing the system to reprocess thousands of subsequent tokens. On documents up to 32,000 tokens long, it achieves nearly the same accuracy as full recomputation while being 7 times faster.

Why it matters

Long-context AI applications frequently encounter stale search results, incorrect tool outputs, or harmful injected content that only become apparent mid-processing. KVEraser enables real-time removal of this bad information without the computational penalty that would otherwise make it impractical—turning a 17.6x slowdown into just a 24% one. This makes it feasible to build AI systems that can correct themselves and respond safely to new user instructions mid-conversation.

Posted on arXiv · Jun 15, 2026

Engineering Jun 16, 2026

Simulation-Based Multi-Fillet Evaluation of Woody Breast Poultry Fillets

Spotting diseased chicken meat by watching multiple fillets bend at once

Chirantan Sen Mukherjee, Seung-Chul Yoon, William J. Beksi
arXiv:2606.16951

Summary

A chicken breast disease called woody breast makes meat tough and worthless, but current detection systems scan only one fillet at a time, slowing down processing plants. Researchers created a physics-based computer simulation and a new camera angle that can evaluate multiple fillets simultaneously by tracking how they bend, offering a faster alternative to the existing method.

Why it matters

Poultry processing plants lose millions annually to woody breast going undetected. A system that evaluates several fillets at once instead of one could speed up quality control lines while catching more diseased meat before it reaches consumers. This approach could help producers reduce waste and improve food safety without expensive equipment overhauls.

Posted on arXiv · Jun 15, 2026

Quantitative Finance Jun 16, 2026

Revisiting Trade-sign Long-memory and Square-root Law price impact

Why large trades leave predictable price fingerprints in financial markets

Chris Angstmann, Tim Gebbie
arXiv:2606.16269

Summary

When traders execute large orders, markets exhibit two well-known patterns: past trade directions predict future ones (long-memory), and price impact grows with the square root of order size rather than linearly. This paper derives both patterns from a single mathematical framework based on how buy and sell orders pile up in the market, showing that the long-memory effect is really about timing of trades, while the square-root law reflects the market's actual survival and stability.
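As a rough illustration of the square-root law described above, the sketch below uses a commonly cited empirical form: impact ≈ Y · σ · sqrt(Q / V). This form, the function name `sqrt_impact`, and the parameters `Y`, `sigma`, and `daily_volume` are assumptions made for illustration. They are not taken from the paper, which derives the law from its own framework.

```python
import math


def sqrt_impact(order_size, daily_volume, sigma, Y=1.0):
    """Hypothetical square-root impact: Y * sigma * sqrt(order_size / daily_volume)."""
    if order_size < 0 or daily_volume <= 0:
        raise ValueError("order_size must be >= 0 and daily_volume must be > 0")
    return Y * sigma * math.sqrt(order_size / daily_volume)


# Quadrupling the order size only doubles the estimated impact.
small = sqrt_impact(order_size=10_000, daily_volume=1_000_000, sigma=0.02)
large = sqrt_impact(order_size=40_000, daily_volume=1_000_000, sigma=0.02)
print(large / small)
```

Under a linear law the same quadrupling would quadruple the impact, which is the contrast the summary draws.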

Why it matters

Large institutional investors rely on these patterns to predict how much a trade will move the market and to design execution strategies that minimize costs. Clarifying exactly why these patterns emerge—and distinguishing between patterns that depend on how often trades happen versus how many shares move—helps traders and risk managers build more accurate models of real market behavior and avoid costly surprises when market conditions shift.

Posted on arXiv · Jun 15, 2026

Computer Science · AI Jun 16, 2026

When in Doubt, Plan It Out: Committed Small Language Model Deliberation for Reactive Reinforcement Learning

Pairing quick AI reflexes with slow, careful thinking for better decisions

Nathan Gavenski, Juarez Monteiro, Francisco Galuppo et al.
arXiv:2606.16995

Summary

A hybrid system called PACT combines a fast, instinctive AI policy with a small language model that stops to think and plan. When the AI encounters unfamiliar situations, it calls on the language model to generate and test action plans before committing to them, dramatically outperforming either approach alone on difficult navigation tasks.

Why it matters

AI systems deployed in the real world—robots, autonomous vehicles, safety-critical systems—often fail when they encounter situations they weren't trained on. PACT shows that adding a deliberative planning step can catch and prevent these failures without retraining the core system, making existing AI safer and more reliable when conditions change unexpectedly.

Posted on arXiv · Jun 15, 2026

Quantitative Biology Jun 16, 2026

Circuit Tracing in Autoregressive Protein Language Models

Decoding how AI models generate new protein sequences

Darin Tsui, William Deinzer, Daniel Saeedi et al.
arXiv:2606.16044

Summary

Researchers created ProGenMech, a new tool to reverse-engineer how protein-generating AI models work. By tracing the computational pathways through these models, they discovered that the systems identify sparse, meaningful patterns—like conserved sequence motifs—that guide protein generation and predict protein quality, revealing that the AI learns recognizable biological logic rather than just statistical shortcuts.

Why it matters

Protein generation AI could accelerate drug discovery and enzyme design, but scientists can only trust these models once they understand what the AI is actually doing. By making these models interpretable, researchers can verify the generated proteins follow real biological principles, catch failures before expensive lab testing, and potentially steer the AI toward specific desired properties—turning black-box generation into a tool biologists can actually use.

Posted on arXiv · Jun 14, 2026

Physics Jun 16, 2026

Bath memory as a precision resource in quantum transport

Using quantum bath memory to squeeze more precision from atomic-scale devices

José Molina, Sheikh Parvez Mandal, Mahasweta Pandit et al.
arXiv:2606.17026

Summary

Physicists have identified how to harness the quantum environment surrounding tiny conductors to reduce noise and boost measurement precision. The key is tuning the bandwidth of this environment to create synchronized interference patterns in electron flow, allowing devices to achieve better precision than systems without this engineered memory effect.

Why it matters

Quantum dots and other nanoscale devices are candidates for ultra-precise sensors and quantum computers, but noise from their surroundings degrades performance. This work provides experimentalists with a concrete, measurable target—the minimum current noise point—that tells them when their device is operating at peak precision, making it practical to build better quantum technologies.

Posted on arXiv · Jun 15, 2026

Computer Science · AI Jun 15, 2026

Persona-Pruner: Sculpting Lightweight Models for Role-Playing

Shrinking AI chatbots without losing their personality or ability to act like specific characters

Jinsu Kim, Jihoon Tack, Noah Lee et al.
arXiv:2606.14695

Summary

A new method called Persona-Pruner can strip away unnecessary parts of large language models while keeping the specific personality traits needed for a single character role. When tested, it preserved 93.8% more of the original performance compared to standard pruning techniques, creating lightweight models that still sound and act like their intended persona.

Why it matters

Video games, virtual assistants, and interactive storytelling platforms often need dozens or hundreds of distinct NPC characters running simultaneously. Current AI chatbots require running a full, massive model for each character, which is computationally expensive and slow. Persona-Pruner makes each character's AI 5–10 times smaller without noticeable degradation, which means more characters can run at once on cheaper hardware, making complex interactive worlds actually affordable to build and operate.

Posted on arXiv · Jun 12, 2026

Mathematics Jun 15, 2026

A Statistical and Machine Learning Framework for Operational Threshold Detection and Deployable Dispatch Controller Development in Hydrogen Multi-Energy Systems

Machine learning reveals hidden patterns in hydrogen energy systems

Shadi Heenatigala, Hasanika Samarasinghe
arXiv:2606.14601

Summary

Researchers analyzed a year of real operating data from a hydrogen energy system and found that solar power alone explains nearly half of hydrogen production variation—but wind's importance only became visible when they switched from traditional statistics to machine learning methods. This revealed that wind affects hydrogen production in complex, non-linear ways that simple correlation measures completely miss, suggesting that solar and wind interact in ways traditional analysis can't detect.

Why it matters

Hydrogen systems are being built now as part of the shift to renewable energy, but operators don't yet know how to run them efficiently. This framework provides a practical toolkit for predicting when to make hydrogen and when to sell it back to the grid, potentially reducing waste and improving revenue. The finding that machine learning uncovers real dynamics hidden from traditional statistics means energy operators need both approaches working together to actually optimize these systems.

Posted on arXiv · Jun 12, 2026

Economics Jun 15, 2026

Wealth Inequality and Planetary Boundaries in a Stylized Agent-Based Model

Why rich countries stay trapped burning fossil fuels despite knowing better

Thomas Valade, Michael Benzaquen, Matthieu Cristelli et al.
arXiv:2606.14331

Summary

A computer simulation of economic decisions reveals a vicious cycle: wealthy people and nations feel insulated from climate disasters, so they invest less in clean energy, which slows the transition away from fossil fuels even when most people care about the environment. The model shows this trap persists at wealth-inequality levels matching those of today's developed countries, and that carbon taxes or green subsidies only work if they're paired with policies that reduce inequality itself.

Why it matters

Policymakers trying to accelerate the shift to renewable energy often assume the main barriers are technological or financial. This research suggests inequality itself is the lock. It implies that climate plans which ignore wealth distribution—taxing the rich heavily without redistributing gains—will fail or move glacially. Countries may need to combine green investment with income redistribution, not choose between them.

Posted on arXiv · Jun 12, 2026

Computer Science · AI Jun 15, 2026

From Self-Supervised Speech Models to Mixture-of-Experts for Robust Anti-Spoofing

Making voice-cloning detection work against new fake-speech techniques

Hugo Daumain, Driss Matrouf, Khaled Khelif et al.
arXiv:2606.14639

Summary

Researchers upgraded a speech-analysis AI system using a technique called Mixture-of-Experts, which lets multiple specialized neural networks work together to catch synthetic voices. The system reduced errors by 12% when tested against 14 different datasets of spoofed audio, and crucially, it maintained its ability to detect new types of fake speech it had never encountered before.
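For readers unfamiliar with Mixture-of-Experts, the general idea can be sketched in a few lines: a gate scores each expert for a given input, and the final output is the gate-weighted combination of the experts' outputs. The toy experts, the toy gate, and the function names below are hypothetical. This is a generic sketch of the technique, not the architecture used in the paper.

```python
import math


def softmax(scores):
    """Turn raw gate scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(x, experts, gate):
    """Combine expert outputs, weighted by the gate's softmax scores."""
    weights = softmax(gate(x))
    outputs = [expert(x) for expert in experts]
    return sum(w * o for w, o in zip(weights, outputs))


# Two toy "experts" that each return a spoof score (hypothetical).
experts = [lambda x: 0.0, lambda x: 1.0]


# A toy gate that trusts both experts equally for every input.
def gate(x):
    return [0.0, 0.0]


score = mixture_of_experts(0.5, experts, gate)
print(score)
```

In a trained system the gate is learned, so different inputs lean on different experts.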

Why it matters

Voice-based authentication is increasingly used for banking, phone systems, and security—making reliable detection of deepfake audio critical. As AI-generated speech becomes more convincing, anti-spoofing systems that fail on novel synthesis methods create real security gaps. This approach offers measurably better detection across diverse generation techniques, meaning voice-based systems can defend against both current and emerging deepfake threats.

Posted on arXiv · Jun 12, 2026

Statistics Jun 15, 2026

Federated Learning for Feature Generalization with Convex Constraints

Helping distributed AI systems learn shared skills without overfitting to local data

Dongwon Kim, Donghee Kim, Sung Kuk Shyn et al.
arXiv:2606.14416

Summary

When machine learning models train across multiple devices with different data, they often overfit to their local information and lose the ability to generalize. Researchers developed FedCONST, which automatically adjusts how much each device's updates influence the shared model, ensuring that well-learned features don't drown out weaker ones during the merging process.
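The idea of adjusting how much each device's update influences the shared model can be illustrated with a plain weighted aggregation step. This is a minimal sketch of generic weighted federated averaging, not FedCONST itself. The function name `aggregate` and the hand-picked `client_scales` are assumptions, and the paper's method chooses these influences automatically under its own constraints.

```python
def aggregate(global_weights, client_updates, client_scales):
    """Add a scale-weighted average of client updates to the global weights."""
    if len(client_updates) != len(client_scales):
        raise ValueError("need one scale per client update")
    total = sum(client_scales)
    if total <= 0:
        raise ValueError("scales must sum to a positive number")
    merged = list(global_weights)
    for update, scale in zip(client_updates, client_scales):
        for i, delta in enumerate(update):
            # Each client's contribution is proportional to its scale.
            merged[i] += (scale / total) * delta
    return merged


# Two clients: the second is given three times the influence of the first.
new_weights = aggregate(
    global_weights=[0.0, 0.0],
    client_updates=[[1.0, 0.0], [0.0, 2.0]],
    client_scales=[1.0, 3.0],
)
print(new_weights)
```

Changing `client_scales` shifts which client's update dominates the merged model, which is the lever the summary describes.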

Why it matters

Federated learning powers real-world systems like predictive keyboards, health apps, and industrial sensors that must learn from private data without sending it to a central server. Better generalization means these systems work reliably when deployed to new users or environments, rather than degrading because they memorized quirks of their training group. This directly improves the practical performance of privacy-preserving AI across smartphones, hospitals, and distributed networks.

Posted on arXiv · Jun 12, 2026