PAPER PLAINE

Fresh research, simply explained. Updates twice daily.

Self-Augmenting Retrieval for Diffusion Language Models

Using a language model's uncertain guesses to find better information faster

Discrete diffusion language models generate text by repeatedly refining all words at once, discarding low-confidence predictions at each step. Researchers discovered that these rejected words contain valuable clues about what information the model will need, and built a system called SARDI that uses those clues to retrieve relevant facts during generation. On five question-answering benchmarks, SARDI outperformed existing methods while running up to 8 times faster.
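
To make the idea concrete, here is a toy sketch of one refinement step, not SARDI's actual implementation. Everything in it is an assumption made for illustration: the function names (denoise_step, retrieve), the confidence threshold, the hand-written confidence scores, and the simple word-overlap search standing in for a real retriever.

```python
from typing import List, Tuple

MASK = "[MASK]"  # placeholder for a position the model will guess again


def denoise_step(
    predictions: List[Tuple[str, float]], threshold: float
) -> Tuple[List[str], List[str]]:
    """Keep confident tokens, re-mask the rest, and return the rejected guesses."""
    sequence, rejected = [], []
    for token, confidence in predictions:
        if confidence >= threshold:
            sequence.append(token)
        else:
            sequence.append(MASK)    # position is refined again in the next step
            rejected.append(token)   # the discarded guess is kept as a retrieval clue
    return sequence, rejected


def retrieve(query_terms: List[str], corpus: List[str], k: int = 1) -> List[str]:
    """Rank documents by how many query terms they share (a stand-in retriever)."""
    terms = {t.lower() for t in query_terms}
    ranked = sorted(
        corpus,
        key=lambda doc: len(terms & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical per-position guesses and confidences for "Who wrote Hamlet?"
predictions = [
    ("Hamlet", 0.97), ("was", 0.95), ("written", 0.90), ("by", 0.96),
    ("Shakespeare", 0.40), ("in", 0.80), ("1600", 0.30),
]
corpus = [
    "jane austen wrote emma in 1815",
    "william shakespeare wrote hamlet around 1600",
    "hamlet is also the name of a small town",
]

sequence, rejected = denoise_step(predictions, threshold=0.5)
evidence = retrieve(rejected, corpus, k=1)
print(sequence)
print(rejected)
print(evidence)
```

In this toy run the two low-confidence guesses are removed from the draft, yet they are exactly the words that point the search at the right document. In a real system the retrieved text would then be fed back to the model before the next refinement step.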

Retrieval-augmented systems currently have to choose what to look up before the answer is finalized, so they often miss crucial facts or waste computation on irrelevant searches. SARDI addresses this by using the guesses the model discards while it works to decide what to retrieve, delivering either more accurate answers in the same time or equally accurate answers much faster. This matters for applications like research assistants or chatbots that need both speed and accuracy.