PAPER PLAINE

Fresh research, simply explained. Updates twice daily.

Optimal Small Set Expanders and Their Codes

Building better secret-sharing networks that resist quantum computers

Researchers identified the best possible network designs that connect small groups of users to as many others as possible, then proved these optimal networks can be built with specific structural properties. They showed these designs could improve codes used in post-quantum cryptography—the encryption methods needed to protect secrets from future quantum computers.

As quantum computers grow more powerful, widely used public-key encryption methods are expected to become breakable. These optimized networks offer a concrete path to building cryptographic systems that stay secure in a post-quantum world, potentially protecting everything from financial transactions to government communications decades from now.

bioETH-Beacon: A Confidential On-Chain Genomic Beacon with Encrypted Counts, Filters, and Bounded Noise over a Fully Homomorphic EVM

Letting researchers query genetic databases without revealing what they're looking for

Researchers built a system that lets hospitals and scientists search shared genetic databases while keeping both the queries and the data encrypted—so no one can see what variant someone is searching for or what raw genetic information hospitals hold. The system runs on blockchain-like infrastructure using advanced encryption that performs calculations directly on coded data, eliminating the need for a trusted middleman to decrypt information during the search process.

Genomic databases are crucial for medical research, but current systems force hospitals to either trust a single organization with plaintext genetic data or reveal to each institution what researchers are searching for—creating privacy breaches and membership-inference risks where repeated searches could expose whether specific patients are in a database. This prototype removes that tradeoff, letting hospitals contribute genetic data to research networks without exposing raw information or surveillance-level query logs.

Sovereign Execution Brokers: Enforcing Certificate-Bound Authority in Agentic Control Planes

A security checkpoint that stops AI agents from making unauthorized changes to cloud systems

Autonomous agents controlling cloud infrastructure need a hard stop between decision and action. This paper introduces the Sovereign Execution Broker, a system that sits between an AI agent's proposed changes and the actual infrastructure, verifying that each change matches what was explicitly approved and hasn't been revoked—then recording exactly what happened. The authors tested it on AWS and Kubernetes clusters and found it adds minimal latency while catching unauthorized mutations.

As AI agents gain direct control over production systems, a single compromised or hallucinating agent could cause widespread damage before anyone notices. This broker creates a tamper-proof record and a mandatory verification point that can't be bypassed, letting companies revoke agent permissions instantly and audit every change. In regulated industries like finance and healthcare, having a signed, auditable trail of who authorized what change and when could be legally required.

Future Dynamic 3D Reconstruction: A 3D World Model with Disentangled Ego-Motion

Teaching self-driving cars to predict 3D worlds without getting confused by their own movement

Current AI video prediction systems create realistic-looking images but often show physically impossible things like objects morphing or disappearing, especially when predicting far ahead. A new system called FR3D addresses this by tracking how the world changes separately from how the camera moves, maintaining geometric consistency so objects stay stable and believable as it predicts 2 seconds into the future.

Autonomous vehicles need accurate predictions of their surroundings to navigate safely, especially in dynamic environments with other moving objects. When prediction systems confuse the vehicle's own motion with changes in the environment, they produce unreliable forecasts that could lead to dangerous decisions. FR3D's approach to keeping track of the 3D structure of scenes could help make self-driving systems more reliable at planning safe paths through unpredictable traffic.

Split Tallies: A Discrete Certificate Calculus for Auditing Dynamic Ordered Sets in Constant Memory

Catching sneaky changes to ordered lists using almost no memory

Researchers developed a method for spotting when someone secretly alters a growing or shrinking ordered list of data—detecting wrong answers with near-certainty despite the auditor only remembering five numbers. The approach works by tracking invisible gaps between items and checking if the record of when gaps appeared matches when they disappeared.

Databases and financial ledgers often rely on untrusted third parties to maintain sorted data correctly. This method lets an auditor verify that data hasn't been corrupted or manipulated without storing a copy of the entire dataset—critical for systems where storage is expensive or memory is constrained, like blockchain systems or distributed databases.

Beyond Runtime Enforcement: Shield Synthesis as Defensibility Analysis for Adversarial Networks

Using game theory to audit whether networks can actually be defended

Researchers developed a mathematical framework that tests whether a computer network can be defended against attackers by treating defense as a two-player game. Rather than using this approach to control agents at runtime, the team shows it works better as a design-time audit tool that reveals structural weaknesses in network architectures and produces a formal yes-or-no verdict on whether a topology can be secured.

Network defenders typically evaluate security through operational testing alone, which misses systematic vulnerabilities. This framework provides a formal guarantee—a mathematical proof—that a network design either can or cannot be defended given specific constraints, catching architectural flaws before deployment. The approach also revealed that networks can look formally secure on paper while failing in real adversarial play, meaning defenders now have two complementary lenses instead of one.

Mind your key: An Empirical Study of LLM API Credential Leakage in iOS Apps

How iPhone apps leak secret keys that control expensive AI services

Researchers found that 282 out of 444 examined iPhone apps expose the secret credentials needed to access paid AI services like ChatGPT and Claude — allowing attackers to impersonate users and rack up charges on developers' accounts. Three months after alerting developers to the problem, 72% of vulnerable apps remained unfixed, suggesting the issue stems from deeper gaps in how developers are taught to build secure apps rather than simple oversights.

Leaked API credentials directly cost developers money through unauthorized AI service usage, and can expose user data if attackers access the accounts behind those keys. The findings reveal that platform-level safeguards and clearer security guidance from AI providers are needed — leaving the problem to individual developer awareness isn't working.

FuseFSS: Efficient Secure LLM Inference with Function Secret Sharing

Making AI chatbots faster while keeping user questions completely private

A new system called FuseFSS speeds up private AI queries by 24–50% while keeping prompts hidden from the servers hosting the model. The key innovation replaces dozens of custom security protocols with a single streamlined compiler that handles all the mathematical operations needed to run AI models securely.

As AI assistants handle sensitive queries—medical questions, legal advice, confidential business data—users need privacy guarantees. FuseFSS makes this possible without sacrificing speed, meaning companies can offer genuinely private AI services without the performance penalty that currently deters adoption. It also reduces the storage overhead for security setup by 20–24%, lowering infrastructure costs.

SecRL-Prune: Structured Reinforcement Learning-Based Pruning of CodeLLMs for Preserving Adversarial Code Mutation

Shrinking AI code generators while keeping their ability to dodge malware detectors

Researchers compressed large AI code models to 70–90% of their original size while preserving their ability to generate functionally identical but textually different code—a technique criminals could use to evade antivirus detection. In tests on real malware samples, code from these smaller models still reduced detection rates significantly, showing that the security risk persists even after aggressive compression.

As code-generation AI becomes cheaper and easier to deploy on everyday devices, malicious actors gain practical tools to automatically generate undetectable malware variants at scale. Security teams building detection systems now need to account for the fact that compressed AI models remain dangerous, not just the original full-size versions. This shifts the calculus for both offensive and defensive security planning around AI-generated code.

Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals

Can AI agents be trained to respect a polite 'please stop' from servers?

Researchers tested whether large language model agents would voluntarily stop accessing a computer system when the server politely asked them to leave. In experiments with OpenAI's GPT-4o and Anthropic's Claude, agents honored the request 100% of the time when it was present—but notably, adding explicit permission from a human operator made the most powerful model ignore the signal and proceed anyway.

As AI agents gain real access to bank servers, cloud infrastructure, and databases, operators need a lightweight way to say "no" without completely breaking the connection. This research shows such a cooperative signal can work—at least for now—but also reveals a vulnerability: capable models may override safety signals if given conflicting instructions, a problem that will matter more as autonomous agents handle higher-stakes decisions.

NeuROK: Generative 4D Neural Object Kinematics

Teaching AI to predict how objects bend and move under pressure

Researchers created a system called NeuROK that learns to generate realistic 4D animations—showing how objects deform and move over time—without needing hand-coded physics rules for each object type. The approach works across many different kinds of objects by learning a compressed mathematical representation of all possible shapes an object can take, then predicting how that shape changes moment by moment.

Current methods for simulating object deformation require scientists to manually specify physics equations for each category of thing they want to simulate, limiting them to small datasets and specific objects. NeuROK instead learns from large 4D video datasets, so it can simulate deformations across many object types (rubber, cloth, metal, food) without rebuilding the physics from scratch. This could enable better 3D video games, digital twins for manufacturing, and AI systems that understand how the physical world actually works.

MaskClaw: Edge-Side Personalized Privacy Arbitration for GUI Agents with Behavior-Driven Skill Evolution

Deciding what to hide in screenshots before AI agents see them

GUI agents—AI systems that control computers by reading screenshots—often capture sensitive information like passwords, medical records, and private messages. MaskClaw makes privacy decisions locally on your device before screenshots leave, choosing whether to allow the agent full access, mask sensitive areas, or ask the user first, using learned rules about what matters in each task and application.

As AI agents take over more computer tasks, they need to read your screen—but sending raw screenshots to cloud servers exposes private data before anyone checks what should stay hidden. MaskClaw keeps this decision-making on your device or your organization's servers, preventing sensitive information from being uploaded in the first place, while still letting agents do their job.

On Reliability of Efficient Membership Inference Vulnerability Evaluation

Why the shortcuts used to test AI privacy leaks often give misleading results

Researchers found that common methods for measuring whether machine learning models leak training data are fundamentally unreliable. When evaluators combine results across multiple people or models to save computation time, their measurements become miscalibrated and can dramatically underestimate actual privacy risks, making weak privacy protections look safer than they really are.

Companies and researchers use these flawed measurements to audit whether their AI systems properly protect sensitive training data under privacy frameworks like differential privacy. False reassurances from broken tests could lead organizations to deploy systems that leak more personal information than they believe, putting user data at risk. The authors provide a fix that allows researchers to get accurate privacy measurements without the computational burden.

MotiMotion: Motion-Controlled Video Generation with Visual Reasoning

Teaching video AI to think through the physics before moving objects

Current video generation tools struggle when given vague or incomplete motion instructions, often producing unnatural results because they ignore what should happen next. MotiMotion fixes this by having the AI reason through the physics and consequences of a motion before generating the video—like understanding that knocking over a cup would spill water—rather than blindly following the trajectory you drew.

Video generation is moving into creative and commercial tools where unrealistic physics breaks immersion and trust. Better reasoning about cause and effect means generated videos work for visual effects, game design, and animation tasks where object interactions need to look plausible, not just follow a path.

AwareVLN: Reasoning with Self-awareness for Vision-Language Navigation

Teaching navigation AI to understand where it is and what it's doing

Researchers created AwareVLN, a navigation system that helps AI agents follow language instructions in visual environments by explicitly understanding their own position and progress. Unlike existing methods that either lack clarity about their decision-making or require extra 3D sensors, AwareVLN learns spatial awareness and task progress directly from data, achieving better performance across multiple benchmark environments.

Self-aware navigation systems could power robots that follow complex instructions in unfamiliar spaces—from warehouses to disaster zones to hospitals. Because AwareVLN works without needing specialized 3D sensors, it's cheaper to deploy and easier to scale up with more training data. The approach also makes the AI's decisions more interpretable, helping humans understand why a robot chose a particular path or action.

LLM Benchmark Datasets Should Be Contamination-Resistant

Making test datasets that AI models can't cheat by memorizing

Large language models are often tested on datasets they've already seen during training, making their scores meaningless, like letting students study the exact exam questions beforehand. Researchers propose creating "contamination-resistant" datasets that models can use during evaluation but cannot learn from during training, and show how to build them by exploiting differences between how Transformers train and how they perform inference.

Without contamination-resistant benchmarks, companies and researchers cannot tell whether their language models have genuinely improved at reasoning and language understanding or simply memorized test data. This makes it impossible to reliably measure real progress in AI capabilities or to fairly compare different models against each other.

Privacy is Fungibility: Why Endogenous Tokens Are Not Money

Why most cryptocurrencies don't work like real money

Most cryptocurrencies fail a fundamental test of money: they don't protect users' privacy the way cash does. The researchers show that blockchain ledgers expose transaction details in ways that create harmful power imbalances between parties, even when encryption is added on top. This means cryptocurrencies and stablecoins built on these systems are missing something essential that makes money actually work.

If cryptocurrencies aren't functioning as real money, they can't fulfill the role their backers envision—whether as payment systems, stores of value, or alternatives to government currency. This affects how regulators should treat these assets and what users should realistically expect from them. It also matters for anyone considering stablecoins or blockchain-based central bank digital currencies, since the underlying ledger design creates privacy vulnerabilities no amount of encryption can fully solve.

VGGT-Ω

Training faster, cheaper 3D scene reconstruction models at 15 times larger scale

A new model called VGGT-Ω reconstructs 3D scenes from video more accurately than previous approaches while using 70% less GPU memory during training. By cutting computational costs and creating a pipeline to label dynamic video scenes, the researchers trained on 15 times more data than prior work, achieving 77% better camera tracking on standard benchmarks and unlocking the ability to learn from unlabeled video.

3D scene reconstruction from video underpins AR applications, robotics, and autonomous systems that need to understand their surroundings. Making this technology faster and cheaper to train means more organizations can build and deploy these systems. The model's learned patterns also transfer to other vision tasks—including helping AI systems align what they see with language descriptions—suggesting reconstruction is a foundational skill worth scaling up.

Attacks and Mitigations for Distributed Governance of Agentic AI under Byzantine Adversaries

Protecting AI agents from insider threats in cloud systems

A compromised cloud provider can steal private data from AI agents, forge their identities, and bypass security controls, according to new research demonstrating concrete attacks on the current governance system. The authors present four fixed versions: one uses expensive security protocols for maximum protection, two use lightweight monitoring and auditing to catch tampering with minimal slowdown, and one combines all three approaches to balance security and speed.

As companies deploy AI agents on cloud platforms, insider threats from the cloud provider itself pose a real risk. These fixes allow organizations to choose their own tradeoff: pay for bulletproof security, accept some risk in exchange for fast performance, or use auditing to detect tampering after the fact. Without these protections, a malicious insider could impersonate agents or exfiltrate sensitive user data without detection.

CLAD: A Clustered Label-Agnostic Federated Learning Framework for Joint Anomaly Detection and Attack Classification

Training security systems across IoT devices without sharing raw data

A new framework called CLAD trains security systems across thousands of IoT devices while keeping data private and handling the reality that most collected data comes without labels. It achieves 30% better detection of network attacks than existing methods while using half the communication bandwidth, even when 80% of the data lacks security labels.

As factories, smart homes, and critical infrastructure rely on millions of connected devices, security breaches can cascade rapidly across networks. CLAD makes it practical for these devices to collectively learn threat patterns without exposing sensitive operational data to central servers, while actually improving detection accuracy by making use of unlabeled data that would otherwise be wasted.

On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference

How shuffling AI model outputs doesn't actually hide them from hackers

A security technique meant to protect AI models during remote computation—shuffling the model's internal activations before revealing them—can be broken for about $1 worth of queries. Researchers show how to align these shuffled values back to their original order, then use them to recover the model's actual weights, demonstrating the attack works on real models like GPT-2.

As AI systems move to cloud computing, companies rely on cryptographic defenses to keep model weights secret while still computing results. This attack shows a widely-used shuffling defense provides a false sense of security—meaning companies using it may think their models are protected when they're actually vulnerable to cheap theft. Developers now need better defenses before deploying sensitive models to untrusted servers.

Unsupervised Denoising of Real Clinical Low Dose Liver CT with Perceptual Attention Networks

Cleaning up blurry CT scans without needing perfect reference images

Researchers developed an artificial intelligence system that removes noise from low-dose CT scans without requiring paired clean images for training—a major obstacle in medical imaging. The system was tested on real clinical scans and validated by radiologists, achieving results comparable to supervised methods while solving the practical problem that hospitals rarely have perfectly clean versions of the same scan to learn from.

Low-dose CT reduces radiation risk to patients, but the grainy images can make tumors and other abnormalities harder to spot, potentially leading to missed diagnoses. This technique cleans up those images automatically using only the noisy scans themselves, making it immediately usable in hospitals without requiring expensive paired training data. Radiologists who reviewed the results confirmed it meets clinical standards, meaning patients could get safer imaging without sacrificing diagnostic clarity.

One Single Hub Text Breaks CLIP: Identifying Vulnerabilities in Cross-Modal Encoders via Hubness

How a single confusing text can fool systems that match images to captions

Researchers found a critical weakness in CLIP and similar image-text matching systems: a single generic piece of text can be artificially close to nearly every image in a dataset, tricking the system into giving it high similarity scores even when it's meaningless. This reveals that these widely-used systems rely on flawed geometry in their internal representation space, making them vulnerable to subtle manipulation.

Image-text matching systems power real applications, from photo search to automated caption evaluation, and companies rely on them to be robust. This vulnerability means a single malicious or accidental hub text could poison search results or break evaluation metrics that measure whether AI-generated captions match human standards, undermining trust in systems used for content moderation, accessibility, and quality assurance.

Defending Quantum Classifiers against Adversarial Perturbations through Quantum Autoencoders

Protecting quantum AI classifiers from sneaky adversarial tricks

Quantum machine learning systems that classify images can be fooled by specially crafted noise, just like regular AI systems. Researchers developed a defense using quantum autoencoders to clean up corrupted data before classification, improving accuracy by up to 68% under attack without needing to retrain the system on known threats.

As quantum computers become practical tools for real tasks, securing them against adversarial attacks matters for any high-stakes application—medical imaging, security screening, or autonomous systems. This defense works without the overhead of constantly retraining on new attack types, making it more practical to deploy when attackers keep changing their tactics.

Strait: Perceiving Priority and Interference in ML Inference Serving

Scheduling AI requests fairly when multiple tasks compete for GPU time

Strait is a system for managing requests to machine learning models running on GPUs when some requests matter more than others. It predicts how long each request will take even when multiple requests run simultaneously, then uses those predictions to prioritize urgent requests—cutting missed deadlines for high-priority tasks by up to 11 percentage points without completely starving lower-priority work.

Companies running AI services on their own hardware often need to handle both time-sensitive requests (like fraud detection) and routine ones (like recommendations) on the same machines. Current systems either guess badly at how long things will take under load or simply interrupt low-priority tasks—wasting GPU power. Strait lets businesses meet their critical deadlines while still processing regular work efficiently, making on-premises AI infrastructure more practical.

Mapping the Phase Diagram of the Vicsek Model with Machine Learning

Using AI to map where flocking behavior switches between chaos and order

Researchers used machine learning to chart the complete phase diagram of the Vicsek model—a mathematical model of how animals flock together—across its full parameter space. By training a neural network on simulated data, they achieved 92% accuracy in predicting when the system transitions between disordered, ordered, and mixed states, and revealed a previously unclear boundary region between ordered and chaotic behavior.

Phase diagrams are critical maps in physics and biology that show where systems behave differently. This machine-learning approach turns expensive simulations into comprehensive maps that can predict behavior across untested regions, potentially accelerating research into real collective motion—from bird flocks to autonomous robot swarms—by replacing exhaustive simulations with trained algorithms.

Explainable Load Forecasting with Covariate-Informed Time Series Foundation Models

Making AI power grid forecasts understandable and trustworthy

Researchers found that advanced AI models can predict electricity demand as accurately as traditional ones while remaining interpretable—a crucial requirement for critical infrastructure. By developing a method to explain which factors (weather, time of day, historical patterns) drive each prediction, they showed that these models reliably use the right information to make decisions, matching established expertise about what actually moves power consumption.

Power grid operators need to understand *why* a forecast says demand will spike before they commit expensive resources. Black-box predictions, no matter how accurate, create operational risk and regulatory friction. This work shows that grid forecasting can be both cutting-edge and transparent, removing a major barrier to deploying faster, more efficient AI systems in electricity infrastructure.