PAPER PLAINE

Fresh research, simply explained. Updates twice daily.

Mechanism-Guided Selective Unlearning for RLVR-Induced Reasoning

Forgetting specific skills in AI without breaking everything else

Researchers developed MAST, a technique that selectively removes unwanted reasoning patterns from AI models while preserving their useful abilities. On math-focused AI models, MAST made the system forget targeted skills (correct answers on a test set fell from 45 to 37 out of 150) while keeping other math knowledge intact. That balance failed completely when researchers tried to erase the same patterns from the whole model at once.
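
To put the reported counts in perspective, the short sketch below converts them into accuracy figures. Only the three counts (45, 37, and 150) come from the summary above; the variable names are illustrative, and the percentages are simple arithmetic on those counts rather than figures quoted from the paper.

```python
# Counts taken from the summary above; everything else is derived arithmetic.
total_problems = 150
correct_before = 45   # correct answers before unlearning
correct_after = 37    # correct answers after unlearning

accuracy_before = correct_before / total_problems
accuracy_after = correct_after / total_problems

# Absolute drop in accuracy, and the drop relative to the starting score.
absolute_drop = accuracy_before - accuracy_after
relative_drop = (correct_before - correct_after) / correct_before

print(f"Accuracy before: {accuracy_before:.1%}")                   # 30.0%
print(f"Accuracy after:  {accuracy_after:.1%}")                    # 24.7%
print(f"Absolute drop:   {absolute_drop * 100:.1f} points")        # 5.3 points
print(f"Relative drop:   {relative_drop:.1%}")                     # 17.8%
```

In other words, the model went from answering about 30% of the targeted test problems correctly to about 25%, losing 8 of the 45 problems it had previously solved.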

AI systems sometimes develop reasoning shortcuts or behaviors their creators want to remove. Current methods for erasing these unwanted patterns often damage the model's general abilities, making it worse overall. MAST offers a surgical alternative that could let companies fix problematic AI behavior without rebuilding or retraining a model from scratch, potentially saving time and computational cost while making AI systems safer and more reliable.