When in Doubt, Plan It Out: Committed Small Language Model Deliberation for Reactive Reinforcement Learning

Computer Science · AI · Jun 16, 2026

Pairing quick AI reflexes with slow, careful thinking for better decisions

Nathan Gavenski, Juarez Monteiro, Francisco Galuppo et al.
arXiv:2606.16995

Summary

PACT is a hybrid system that pairs a fast, reactive reinforcement learning policy with a small language model that stops to think and plan. When the policy encounters an unfamiliar situation, it calls on the language model to generate and test candidate action plans, then commits to one. On difficult navigation tasks, the combination far outperforms either component alone.
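
The control flow described in this summary can be pictured with a minimal Python sketch. It is not the authors' implementation: the toy grid world, the `is_uncertain` check, the stub planner standing in for the small language model, and the scoring function are all hypothetical choices made for illustration. The only ideas taken from the summary are the three steps: act reactively by default, generate and test plans when the situation looks unfamiliar, and commit to the chosen plan.

```python
from typing import Callable, List, Sequence, Set, Tuple

State = Tuple[int, int]
Action = str

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(state: State, action: Action, walls: Set[State]) -> State:
    """Toy deterministic dynamics: a move into a wall leaves the state unchanged."""
    dx, dy = MOVES[action]
    nxt = (state[0] + dx, state[1] + dy)
    return state if nxt in walls else nxt


def rollout_score(state: State, plan: Sequence[Action], goal: State,
                  walls: Set[State]) -> int:
    """Test a plan in the toy model; score is minus the final Manhattan distance."""
    for action in plan:
        state = step(state, action, walls)
    return -(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))


def run_episode(start: State, goal: State, walls: Set[State],
                policy: Callable[[State], Action],
                is_uncertain: Callable[[State], bool],
                propose_plans: Callable[[State, State], List[List[Action]]],
                max_steps: int = 20):
    state, committed, trace = start, [], []
    for _ in range(max_steps):
        if state == goal:
            break
        # Deliberate only when no plan is in progress and the state looks unfamiliar.
        if not committed and is_uncertain(state):
            candidates = propose_plans(state, goal)
            if candidates:
                best = max(candidates,
                           key=lambda p: rollout_score(state, p, goal, walls))
                committed = list(best)  # commit to the best-scoring plan
        if committed:
            action, source = committed.pop(0), "plan"
        else:
            action, source = policy(state), "policy"
        state = step(state, action, walls)
        trace.append((source, action))
    return state, trace


# Hypothetical stand-ins for the learned policy, novelty check, and language model.
def always_right(state: State) -> Action:
    return "right"


def near_wall(state: State) -> bool:
    return state == (1, 0)


def stub_planner(state: State, goal: State) -> List[List[Action]]:
    return [["right", "right"], ["down", "right", "right", "up"]]


WALLS = {(2, 0)}
final, trace = run_episode((0, 0), (3, 0), WALLS, always_right, near_wall, stub_planner)
reactive_only, _ = run_episode((0, 0), (3, 0), WALLS, always_right,
                               lambda s: False, stub_planner)
```

In this toy setup the purely reactive policy keeps walking into the wall, while the hybrid loop switches to the detour plan once the uncertainty check fires and then follows that plan to completion instead of re-deciding at every step.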

Why it matters

AI systems deployed in the real world, such as robots, autonomous vehicles, and other safety-critical systems, often fail when they encounter situations they weren't trained on. PACT shows that adding a deliberative planning step can catch and prevent these failures without retraining the core system, making existing AI safer and more reliable when conditions change unexpectedly.

Posted on arXiv · Jun 15, 2026