AI Research — Paper Plaine

Computer Science · AI May 10, 2026

UniPool: A Globally Shared Expert Pool for Mixture-of-Experts

Sharing expert capacity across layers instead of duplicating it per layer

Minbin Huang, Han Shi, Chuanyang Zheng et al.
arXiv:2605.06665

Summary

A new design for mixture-of-experts neural networks treats expert capacity as a shared resource rather than giving each layer its own separate experts. Across five model sizes, this approach reduces validation loss by up to 3.86% and matches the performance of traditional designs while using only 42–67% as many expert parameters, suggesting that experts don't need to multiply linearly as models get deeper.
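
To make the parameter arithmetic concrete, the following minimal sketch compares expert parameter counts for a per-layer design and a single shared pool. The layer count, expert count, and expert dimensions are illustrative assumptions, not values from the paper, and the pool size is chosen only so that the ratio falls inside the 42–67% range quoted above.

```python
def expert_params(d_model: int, d_ff: int) -> int:
    """Parameters in one feed-forward expert (two weight matrices, biases ignored)."""
    return 2 * d_model * d_ff


def per_layer_total(num_layers: int, experts_per_layer: int, d_model: int, d_ff: int) -> int:
    """Traditional design: every layer owns its own set of experts."""
    return num_layers * experts_per_layer * expert_params(d_model, d_ff)


def shared_pool_total(pool_size: int, d_model: int, d_ff: int) -> int:
    """Pooled design: all layers route into one global set of experts."""
    return pool_size * expert_params(d_model, d_ff)


# Hypothetical configuration: 12 layers with 8 experts each vs. a pool of 48.
baseline = per_layer_total(num_layers=12, experts_per_layer=8, d_model=512, d_ff=2048)
pooled = shared_pool_total(pool_size=48, d_model=512, d_ff=2048)
ratio = pooled / baseline
print(f"shared pool uses {ratio:.0%} of the per-layer expert parameters")
```

In the per-layer design the expert count grows with depth, while in the pooled design it is fixed by the pool size, which is the point the summary makes about experts not needing to multiply linearly.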

Why it matters

Current large language models waste capacity by requiring each layer to have its own set of experts, forcing model size to balloon as networks grow deeper. This work shows you can build more efficient models by pooling experts globally, which directly reduces the computational and memory cost of training and running massive AI systems.

Posted on arXiv · May 7, 2026

Computer Science · AI May 9, 2026

ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation

Controlling both actor movement and camera angles in AI-generated videos

Omar El Khalifi, Thomas Rossi, Oscar Fossey et al.
arXiv:2605.06667

Summary

A new method called ActCam lets filmmakers generate videos where they control both how an actor moves and where the camera points—without needing to train a custom AI model. By carefully layering pose and depth information at different stages of video generation, the system maintains geometric consistency and produces results that human raters prefer, especially when the camera makes large jumps to new angles.

Why it matters

Video production typically requires either expensive motion capture setups or manual frame-by-frame editing to coordinate actor movement with camera work. ActCam works with existing AI video generators and requires no retraining, making professional-looking camera control accessible to independent filmmakers and artists who lack studio resources.

Posted on arXiv · May 7, 2026

Computer Science · AI May 9, 2026

StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction

Teaching AI agents to plan ahead instead of just reacting moment-to-moment

Xiangyuan Xue, Yifan Zhou, Zidong Wang et al.
arXiv:2605.06642

Summary

A new training method called StraTA helps large language models work better as decision-making agents by having them sketch out a high-level strategy before taking action. On three real-world task environments, the approach achieved success rates above 93% on some benchmarks and needed fewer training examples than existing methods.

Why it matters

Current AI agents struggle with long chains of decisions because they react to each step without a plan, making them inefficient and error-prone. StraTA's strategy-first approach could improve AI assistants that handle complex real-world tasks like shopping, research, or household management—reducing the computing power and training data needed to get them working reliably.

Posted on arXiv · May 7, 2026

Computer Science · AI May 8, 2026

MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems

Automatically tuning instructions for AI teams that work together

Zhexuan Wang, Xuebo Liu, Li Wang et al.
arXiv:2605.06623

Summary

When multiple AI agents work together on a task, their individual instructions (prompts) need to work well not just in isolation, but as a coordinated system. A new framework called MASPO automatically improves these prompts by testing how well each agent's output helps the next agent succeed, rather than optimizing each agent separately. Tests across six different tasks show this approach outperforms existing methods by an average of 2.9 percentage points.

Why it matters

As companies deploy multi-agent AI systems for complex work, getting these systems to actually cooperate effectively has been a major bottleneck—manually writing and tuning prompts for each agent is slow and often produces suboptimal teamwork. MASPO makes this process automatic and more effective, which could accelerate real-world deployment of AI systems handling tasks like research, customer service, or software development that require coordinated reasoning across multiple specialized agents.

Posted on arXiv · May 7, 2026

Computer Science · AI May 8, 2026

BAMI: Training-Free Bias Mitigation in GUI Grounding

Fixing AI agents that struggle to click the right button on complex screens

Borui Zhang, Bo Zhang, Bo Wang et al.
arXiv:2605.06664

Summary

AI systems that automate computer tasks often fail when screens are high-resolution or crowded with interface elements. A new technique called BAMI improves accuracy without requiring retraining—boosting one model's performance on a challenging benchmark from 52% to 58%—by breaking down the task into simpler steps and filtering out confusing options.

Why it matters

As companies automate more customer service, data entry, and software testing with AI agents, these systems need to reliably click and interact with real websites and applications. This method works with existing AI models off-the-shelf, making it immediately useful for improving the accuracy of automation tools without the expense and time of rebuilding them from scratch.

Posted on arXiv · May 7, 2026

Computer Science · AI May 7, 2026

Superposition Is Not Necessary: A Mechanistic Interpretability Analysis of Transformer Representations for Time Series Forecasting

Why transformers for time series don't need complex hidden patterns

Alper Yıldırım
arXiv:2605.05151

Summary

Transformers work well for predicting time series, but it is unclear how: specifically, whether they rely on the same internal trick (called superposition) that makes them powerful for language. By examining a transformer trained on forecasting, the author found that these models keep things simple: they don't compress multiple patterns into the same neurons, and they ignore most of their hidden layers when making predictions. This helps explain why straightforward linear models stay competitive with far more complex transformer models.

Why it matters

Companies spend millions deploying expensive transformer models for forecasting tasks when simpler, cheaper alternatives work nearly as well. Understanding that transformers aren't actually using sophisticated compositional tricks on time series means practitioners can stop assuming complexity equals better performance and instead choose based on speed, cost, and actual accuracy on their specific problem. This could shift forecasting systems toward simpler, more interpretable models without sacrificing results.

Posted on arXiv · May 6, 2026

Computer Science · AI May 7, 2026

Automatically Finding and Validating Unexpected Side-Effects of Interventions on Language Models

Automatically discovering hidden side effects when tweaking AI language models

Quintin Pope, Ajay Hayagreeve Balaji, Jacques Thibodeau et al.
arXiv:2605.05090

Summary

Researchers built an automated system that compares how a language model behaves before and after an intervention—like when engineers try to make it forget certain information or reason better—and generates human-readable descriptions of what changed. Testing on three real interventions (reasoning training, knowledge editing, and unlearning), the system caught both intended changes and unexpected behavioral shifts that engineers hadn't anticipated.

Why it matters

AI companies make constant changes to their language models, but it's extremely difficult to know all the ways those changes affect behavior beyond the intended goal. This tool lets engineers systematically audit what else changed, catching surprises before models are deployed. That's critical for safety: a fix intended to make a model more helpful might accidentally make it worse at something else, and discovering that requires more than checking the intended behavior.

Posted on arXiv · May 6, 2026

Computer Science · AI May 6, 2026

Flow Sampling: Learning to Sample from Unnormalized Densities via Denoising Conditional Processes

Teaching AI to sample from mathematical functions without wasting computation

Aaron Havens, Brian Karrer, Neta Shaul
arXiv:2605.03984

Summary

Researchers developed Flow Sampling, a method that lets AI systems efficiently generate samples from complex mathematical distributions defined by energy functions—without needing actual data to learn from. The technique cuts down how many times the expensive energy function must be evaluated during training, and works not just in ordinary space but also on curved mathematical surfaces like spheres and hyperbolic geometries.

Why it matters

Many real problems in physics, chemistry, and statistics require sampling from distributions where you know the underlying energy function but can't directly sample from it. This method makes that process far cheaper computationally, opening the door to faster simulations of molecular structures, protein folding, and other complex systems where brute-force sampling would be prohibitively expensive.

Posted on arXiv · May 5, 2026

Computer Science · AI May 6, 2026

Feature-Augmented Transformers for Robust AI-Text Detection Across Domains and Generators

Making AI-text detectors work reliably across different sources and writing styles

Mohamed Mady, Johannes Reschke, Björn Schuller
arXiv:2605.03969

Summary

Detectors trained to spot AI-generated text perform near-perfectly on familiar material but fail badly when encountering text from new sources or generators—a problem researchers call brittleness. Adding linguistic features like readability and vocabulary patterns to a transformer model improved performance across different domains, pushing balanced accuracy from around 60% to 86% when tested on unfamiliar text.
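
As a rough illustration of what "adding linguistic features" can look like, the sketch below computes three simple text statistics and appends them to an embedding vector. The specific features (average sentence length, average word length, type-token ratio) and the function names are assumptions chosen for illustration; the paper's actual feature set and fusion method may differ.

```python
import re


def linguistic_features(text: str) -> list[float]:
    """Return [avg sentence length, avg word length, type-token ratio]."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return [0.0, 0.0, 0.0]
    avg_sentence_len = len(words) / max(len(sentences), 1)  # crude readability proxy
    avg_word_len = sum(len(w) for w in words) / len(words)
    type_token_ratio = len(set(words)) / len(words)          # vocabulary diversity
    return [avg_sentence_len, avg_word_len, type_token_ratio]


def augment(embedding: list[float], text: str) -> list[float]:
    """Concatenate a (hypothetical) transformer embedding with the features."""
    return list(embedding) + linguistic_features(text)


features = linguistic_features("The cat sat. The cat ran.")
combined = augment([0.1, 0.2], "The cat sat. The cat ran.")
```

Features like these depend on the text itself rather than on any one generator, which is one plausible reason they could transfer better to unfamiliar sources.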

Why it matters

As AI systems generate text at scale across the internet, platforms need detectors that actually work in the real world, not just in controlled testing. This research shows that simple feature engineering can make detectors markedly more reliable on text from unfamiliar domains and generators (balanced accuracy rising from around 60% to 86%), making them practically useful for content moderation and detection systems that can't be retrained constantly.

Posted on arXiv · May 5, 2026

Computer Science · AI May 5, 2026

SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection

Speeding up AI by automatically adjusting how many words to guess ahead

Shikhar Shukla
arXiv:2605.02888

Summary

A new system called SpecKV automatically tunes how many tokens a small draft model should propose at each step of speculative decoding, the draft-and-verify process that speeds up large language models. By reading signals from the draft model itself, such as how confident it is in its guesses, SpecKV picks the number of proposals for each step, delivering 56% faster results than the current fixed approach with almost no added overhead.
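
The intuition behind adapting the proposal length can be sketched with a standard back-of-the-envelope model of speculative decoding. This is not SpecKV's selection rule: it assumes each draft token is accepted independently with probability `p` (standing in for draft confidence), and the cost values and function names are hypothetical.

```python
def expected_tokens(p: float, gamma: int) -> float:
    """Expected tokens produced per verification step.

    Assumes each of the gamma draft tokens is accepted independently with
    probability p; the i = 0 term is the one token the large model always adds.
    """
    return sum(p ** i for i in range(gamma + 1))


def pick_gamma(p: float, draft_cost: float = 0.1, verify_cost: float = 1.0,
               max_gamma: int = 8) -> int:
    """Choose the proposal length with the best expected tokens per unit cost."""
    def throughput(gamma: int) -> float:
        return expected_tokens(p, gamma) / (gamma * draft_cost + verify_cost)

    return max(range(1, max_gamma + 1), key=throughput)


low_confidence_gamma = pick_gamma(0.3)
high_confidence_gamma = pick_gamma(0.9)
print(low_confidence_gamma, high_confidence_gamma)
```

Under this toy model a confident draft justifies long proposals and an unsure draft justifies short ones, which is why a single fixed length leaves speed on the table.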

Why it matters

Large language models power chatbots, search, and countless AI applications, and making them faster directly cuts energy costs and lets more people access them affordably. A 56% speedup with minimal overhead means faster responses for users and significantly lower compute bills for companies running these systems at scale.

Posted on arXiv · May 4, 2026

Computer Science · AI May 5, 2026

mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection

Spotting inflammatory speech across 22 languages before it turns toxic

Dominik Macko, Alok Debnath, Jakub Simko
arXiv:2605.02695

Summary

Researchers built an AI system to detect polarizing content online across 22 languages by finetuning large language models with a technique that keeps computational costs manageable. They strengthened the system by training it on multiple versions of the same text—anonymized, capitalized differently, and with character substitutions—making it more likely to catch polarization even when people use tricks to avoid detection.
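
The augmentation idea described above can be sketched as a function that emits several surface variants of one training text. The placeholder tokens, the regular expressions, and the character-substitution table are illustrative assumptions, not the transformations used in the paper.

```python
import re

# Hypothetical look-alike substitutions of the kind used to dodge filters.
SUBSTITUTIONS = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"})


def anonymize(text: str) -> str:
    """Replace URLs and @-mentions with placeholder tokens."""
    text = re.sub(r"https?://\S+", "[URL]", text)
    return re.sub(r"@\w+", "[USER]", text)


def variants(text: str) -> list[str]:
    """Return the original text plus anonymized, recased, and substituted copies."""
    return [
        text,
        anonymize(text),
        text.lower(),
        text.upper(),
        text.translate(SUBSTITUTIONS),
    ]


examples = variants("@Bob says NO http://x.io")
```

Each variant keeps the original label, so the model sees the same judgment attached to several spellings of the same content.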

Why it matters

Online polarization often escalates into hate speech and social division. Catching inflammatory rhetoric early, across languages and cultures, gives platforms a practical tool to intervene before discussions turn hostile. The approach also shows how to build multilingual AI systems efficiently, without needing expensive computational resources.

Posted on arXiv · May 4, 2026

Computer Science · AI May 4, 2026

Towards Improving Speaker Distance Estimation through Generative Impulse Response Augmentation

Using artificial sound reflections to help systems pinpoint where speakers are standing

Anton Ratnarajah, Mehmet Ergezer, Arun Nair et al.
arXiv:2605.00721

Summary

Researchers improved distance estimation accuracy by generating synthetic acoustic data to train AI models. The approach reduced localization error by up to 68% across different room types—bringing average errors down from 2.18 meters to 0.69 meters in some settings.
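
The two figures quoted above are consistent with each other, as a quick check of the relative reduction shows. The helper name is just for illustration.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error shrinks when going from `before` to `after`."""
    return (before - after) / before


# Average error in meters, as quoted in the summary.
reduction = relative_reduction(before=2.18, after=0.69)
print(f"{reduction:.1%}")
```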

Why it matters

Accurate speaker distance estimation matters for hearing aids, video conferencing systems, and spatial audio applications that need to know where someone is in a room. Real acoustic recordings are expensive and limited; this method shows that artificially generated sound reflections can work just as well for training, making it faster and cheaper to build better location-aware audio systems.

Posted on arXiv · May 1, 2026

Computer Science · AI May 4, 2026

Position: agentic AI orchestration should be Bayes-consistent

Why AI assistants need better decision-making rules for choosing which tools to use

Theodore Papamarkou, Pierre Alquier, Matthias Bauer et al.
arXiv:2605.00742

Summary

Large language models are good at predicting and reasoning, but bad at making decisions when stakes are high—like choosing which expert to ask or how much to spend. This paper argues that AI systems should use Bayesian probability rules at the control layer that decides which tools to deploy, rather than trying to make the language models themselves fully probabilistic, because this approach is practical and mathematically sound for real-world decisions under uncertainty.

Why it matters

When an AI system decides to call a specialist, request more data, or allocate resources, getting that call wrong can be expensive or risky. Using Bayesian decision theory at the orchestration level means the system tracks what it actually knows, updates beliefs as it gathers information, and chooses actions deliberately rather than by default. This framework also makes human-AI collaboration clearer: humans can see what the system believes and why it made a choice, making the system's reasoning auditable and correctable.
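
A toy version of "track what it knows, update beliefs, choose deliberately" is sketched below using a Beta-Bernoulli belief per tool and an expected-utility rule. The tool names, costs, and reward are hypothetical, and this is one simple instance of Bayesian decision-making rather than the framework proposed in the paper.

```python
from dataclasses import dataclass


@dataclass
class ToolBelief:
    """Beta(successes, failures) belief about a tool's success rate."""
    successes: float = 1.0  # prior pseudo-count
    failures: float = 1.0   # prior pseudo-count
    cost: float = 0.0       # cost of calling the tool once

    def mean(self) -> float:
        return self.successes / (self.successes + self.failures)

    def update(self, succeeded: bool) -> None:
        """Bayesian update after observing one call outcome."""
        if succeeded:
            self.successes += 1
        else:
            self.failures += 1


def choose_tool(beliefs: dict[str, ToolBelief], reward: float = 1.0) -> str:
    """Pick the tool with the highest expected utility under current beliefs."""
    def expected_utility(name: str) -> float:
        belief = beliefs[name]
        return belief.mean() * reward - belief.cost

    return max(beliefs, key=expected_utility)


beliefs = {"cheap_tool": ToolBelief(cost=0.1), "specialist": ToolBelief(cost=0.4)}
first_choice = choose_tool(beliefs)

# After four observed failures, the cheap tool no longer looks worth calling.
for _ in range(4):
    beliefs["cheap_tool"].update(succeeded=False)
later_choice = choose_tool(beliefs)
```

Because the beliefs are explicit numbers, a human can inspect why the orchestrator switched tools, which is the auditability benefit described above.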

Posted on arXiv · May 1, 2026

Computer Science · AI May 3, 2026

Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces

Better 3D geometry in AI videos by redesigning how models compress visual information

Andrew Bond, Ilkin Umut Melanlioglu, Erkut Erdem et al.
arXiv:2604.28122

Summary

Video models often generate plausible motion but fail to preserve real 3D geometry and camera movement. Researchers developed S²VAE, which replaces conventional compression methods with a geometry-aware design that forces the model to think in terms of 3D space, depth, and physical structure rather than appearance alone—and showed this approach consistently outperforms existing methods, especially when heavy compression is needed.

Why it matters

Video synthesis systems power everything from robotics simulation to 3D content creation. Models that properly preserve 3D geometry and camera physics produce more realistic, physically plausible outputs and could reduce the need for expensive manual corrections or post-processing. This approach also makes visual models more useful for tasks like autonomous navigation, where physical accuracy isn't optional.

Posted on arXiv · Apr 30, 2026

Computer Science · AI May 3, 2026

Splitting Argumentation Frameworks with Collective Attacks and Supports

Breaking complex arguments into manageable pieces while keeping group logic intact

Matti Berthold, Lydia Blümel, Giovanni Buraglio et al.
arXiv:2604.28112

Summary

Researchers developed new techniques to split apart complex argumentation systems that include both collective attacks (where multiple arguments gang up against one) and supports (where arguments reinforce each other). These splitting methods let computers handle larger, messier real-world arguments by breaking them into smaller pieces while preserving the logical relationships that make arguments work or fail together.
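
To show what a collective attack means in practice, the sketch below checks whether a set of arguments is conflict-free when attacks come from groups: a group attack only fires if every one of its members is present. It covers only this basic notion, not supports or the splitting techniques from the paper, and the argument names are made up.

```python
def conflict_free(candidate: set[str], attacks: list[tuple[set[str], str]]) -> bool:
    """Check conflict-freeness under collective attacks.

    Each attack is (attackers, target). The candidate set is in conflict if it
    contains all attackers of some attack together with that attack's target.
    """
    return not any(
        attackers <= candidate and target in candidate
        for attackers, target in attacks
    )


# Arguments "a" and "b" jointly attack "c"; neither attacks it alone.
attacks = [({"a", "b"}, "c")]

partial_ok = conflict_free({"a", "c"}, attacks)
full_conflict = conflict_free({"a", "b", "c"}, attacks)
```

The first set is acceptable because only one of the two attackers is present, which is exactly the kind of group dependency a splitting method has to preserve.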

Why it matters

Argumentation frameworks underpin AI systems that need to reason through competing claims, from legal judgment automation to medical diagnosis support. Making these frameworks faster and more scalable by splitting them intelligently means they can handle realistic, large-scale problems rather than toy examples. This is especially important because real arguments rarely come in clean, flat structures; they're full of interdependencies where one claim supports several others while simultaneously being attacked by groups of opposing claims.

Posted on arXiv · Apr 30, 2026

Computer Science · AI May 2, 2026

Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes

Saving computer resources by knowing when AI agents actually need backups

Tianyuan Wu, Chaokun Chang, Lunxi Cao et al.
arXiv:2604.28138

Summary

Most checkpoints of AI agent sandboxes are wasted because existing systems either skip important OS-level side effects or save state after every single action. Crab cuts checkpoint overhead by 87% by intelligently deciding which agent turns actually produce recoverable state—and achieves perfect recovery where naive chat-only approaches fail.
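
The general idea of skipping checkpoints for turns that change nothing can be sketched as a simple filter over agent turns. This is not Crab's actual policy: the tool names and the read-only allowlist are hypothetical, and a real runtime would inspect OS-level effects rather than trust tool names.

```python
# Hypothetical tools assumed to leave sandbox state untouched.
READ_ONLY_TOOLS = {"read_file", "list_dir", "search"}


def needs_checkpoint(tool_calls: list[str]) -> bool:
    """A turn needs a checkpoint if any of its tool calls may change state."""
    return any(call not in READ_ONLY_TOOLS for call in tool_calls)


def plan_checkpoints(turns: list[list[str]]) -> list[int]:
    """Return the indices of turns after which a checkpoint is taken."""
    return [i for i, calls in enumerate(turns) if needs_checkpoint(calls)]


turns = [
    ["read_file"],           # inspection only
    [],                      # pure chat turn, no tool calls
    ["write_file"],          # modifies the filesystem
    ["search", "list_dir"],  # inspection only
    ["run_shell"],           # may have arbitrary side effects
]
checkpoints = plan_checkpoints(turns)
print(f"checkpointing {len(checkpoints)} of {len(turns)} turns")
```

Checkpointing after every turn would save five snapshots here, while a chat-only approach would miss the filesystem changes entirely.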

Why it matters

AI agents running in sandboxed containers need frequent backups for fault tolerance and experimentation, but constant checkpointing hurts performance and drives up costs. Crab lets companies run more agents on shared hardware at lower cost while keeping the ability to recover from failures or roll back bad decisions, removing a major system bottleneck.

Posted on arXiv · Apr 30, 2026

Computer Science · cs.AI May 2, 2026

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

Testing AI agents on real work that keeps changing, not frozen task lists

Chenxin Li, Zhengyang Tang, Huangxin Lin et al.
arXiv:2604.28139

Summary

AI agents that work across software tools and business systems still struggle with everyday tasks: the best model tested completed only 67% of them. A new benchmark called Claw-Eval-Live tracks what people actually need done rather than relying on static task lists, and grades agents by checking whether they actually executed the work, not just whether they gave a good answer.

Why it matters

Companies increasingly rely on AI agents to handle business workflows like HR tasks and spreadsheet repairs, but current benchmarks don't reflect the real, constantly changing demands these agents face. This benchmark reveals that workflow automation is nowhere near reliable enough for critical business work—and shows that models appearing equally capable on paper can perform very differently on actual tasks, which matters for deciding which AI system to trust with real work.

Posted on arXiv · Apr 30, 2026

Computer Science · cs.AI May 2, 2026

LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis

Using AI language models to clean up messy brain-wave data for seizure detection

Lincan Li, Zheng Chen, Yushun Dong
arXiv:2604.28178

Summary

Researchers showed that large language models can improve how computers detect seizures from EEG recordings by cleaning up noisy connections in the graphs built from those signals. Their two-stage approach first builds a graph of brain-signal relationships, then uses an LLM to remove false or redundant connections, achieving better detection accuracy and more interpretable results on standard medical datasets.

Why it matters

Seizure detection is critical for patient safety, but EEG signals are notoriously noisy and hard to analyze accurately. This method improves detection reliability while making the underlying analysis transparent to doctors, which is important when machine learning outputs directly affect treatment decisions. The approach demonstrates a practical way to combine language models with medical AI, potentially accelerating similar improvements in other brain-signal diagnostics.

Posted on arXiv · Apr 30, 2026

Computer Science · cs.AI May 2, 2026

PhyCo: Learning Controllable Physical Priors for Generative Motion

Teaching AI to generate videos where objects move and collide realistically

Sriram Narayanan, Ziyu Jiang, Srinivasa Narasimhan et al.
arXiv:2604.28169

Summary

Video generation models can now create realistic motion and physics interactions—objects bounce properly, materials deform correctly, and friction behaves as expected—by training on 100,000+ simulated videos where physical properties are systematically varied. The system lets users control these physical attributes directly, without needing to reconstruct 3D geometry or run simulations after generation.

Why it matters

Current video AI produces visually plausible but physically nonsensical motion: objects pass through each other, gravity works inconsistently, and materials respond wrongly to forces. PhyCo fixes this at generation time, which matters for video effects in film and games, robot training simulations, and any application where physical accuracy affects downstream decisions. Users can now specify exact friction or material properties and get videos that respect them automatically.

Posted on arXiv · Apr 30, 2026

Computer Science · cs.AI May 2, 2026

Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists

Mapping how AI methods build on each other to help research agents learn faster

Yujun Wu, Dongxu Zhang, Xinchen Li et al.
arXiv:2604.28158

Summary

Researchers created Intern-Atlas, a map of how artificial intelligence research methods have evolved and built upon one another across over 1 million papers. Unlike traditional citation networks that just link papers together, this map explicitly shows why and how new methods emerge from old ones, capturing the specific breakthroughs that prompt researchers to try different approaches.

Why it matters

AI research agents—systems designed to help scientists by reading and synthesizing research—currently struggle to understand how methods are connected because that information is buried in text. Intern-Atlas gives them an explicit roadmap, making it possible for automated systems to suggest promising research directions or identify when a method is ready for a new application. This infrastructure could accelerate how quickly AI researchers iterate on ideas and help catch dead ends before humans invest time in them.

Posted on arXiv · Apr 30, 2026

Computer Science · cs.AI May 2, 2026

FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems

Cheap, shareable touch sensors that let robots feel what they grab

Binghao Huang, Yunzhu Li
arXiv:2604.28156

Summary

Researchers built FlexiTac, a low-cost tactile sensing system that gives robot hands the ability to detect pressure and texture through flexible sensor pads and simple electronics. The system costs far less than existing alternatives, works on different types of grippers, and can be manufactured quickly and consistently—making it practical for widespread use in robotics labs and industry.

Why it matters

Robot dexterity has been held back by expensive, fragile touch sensors that few labs can afford or easily integrate into new designs. FlexiTac removes that barrier: its open-source design, low manufacturing cost, and plug-and-play setup mean more researchers can experiment with touch-based learning, and manufacturers can add sensitive manipulation to more types of robots. This could accelerate progress in tasks like assembly, sorting, and manipulation that currently require human workers.

Posted on arXiv · Apr 30, 2026