Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography

Quantitative Biology May 25, 2026

How neural networks organize meaning in ways that mirror the human brain

Dongxin Guo, Jikun Wu, Siu Ming Yiu
arXiv:2605.23035

Summary

Researchers used sparse autoencoders to decode how large language models like GPT-2 internally organize semantic information, and found that this organization mirrors the structure of the human brain's language regions. Semantic features alone accounted for 94% of the model's ability to predict brain responses to language, and five specific semantic categories aligned with five distinct brain regions known from neuroscience.
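
To make the "share of predictive performance" idea concrete, here is a minimal sketch on synthetic data. It is not the authors' pipeline: the summary does not say how the 94% figure was computed, so the ridge encoding model, the correlation metric, the feature split, and every name below (`ridge_fit`, `mean_voxel_correlation`, `fraction_explained`, `semantic_idx`) are assumptions made purely for illustration.

```python
import numpy as np

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def mean_voxel_correlation(Y_true, Y_pred):
    """Pearson r between true and predicted responses, averaged over voxels."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    r = (yt * yp).sum(axis=0) / (
        np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    )
    return float(r.mean())

def fraction_explained(X_tr, Y_tr, X_te, Y_te, subset, alpha=1.0):
    """Score of a feature subset relative to the score of all features."""
    full = mean_voxel_correlation(Y_te, X_te @ ridge_fit(X_tr, Y_tr, alpha))
    sub = mean_voxel_correlation(
        Y_te, X_te[:, subset] @ ridge_fit(X_tr[:, subset], Y_tr, alpha)
    )
    return sub / full, full, sub

# Synthetic stand-ins (assumption): 20 sparse features, 50 "voxels".
rng = np.random.default_rng(0)
n_train, n_test, n_feat, n_vox = 400, 100, 20, 50
X = rng.standard_normal((n_train + n_test, n_feat))

# Hypothetical split: the first 10 features play the role of "semantic" ones
# and carry most of the signal; the remaining 10 contribute weakly.
semantic_idx = np.arange(10)
W_true = np.zeros((n_feat, n_vox))
W_true[:10] = rng.standard_normal((10, n_vox))
W_true[10:] = 0.2 * rng.standard_normal((10, n_vox))
Y = X @ W_true + 0.5 * rng.standard_normal((n_train + n_test, n_vox))

ratio, full_score, semantic_score = fraction_explained(
    X[:n_train], Y[:n_train], X[n_train:], Y[n_train:], semantic_idx
)
print(f"full model r = {full_score:.3f}")
print(f"semantic-only r = {semantic_score:.3f}")
print(f"share of full-model performance = {ratio:.1%}")
```

The ratio printed at the end is only one of several plausible ways to define such a share; variance partitioning or ablation-based measures would be equally reasonable readings of the summary.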

Why it matters

This finding helps bridge a major gap between how AI language models work and how human brains process language. It suggests that the brain's semantic architecture isn't arbitrary: a similar organization emerges when a system learns to model language. That could help neuroscientists understand language processing and help AI researchers build models that align more closely with biological intelligence.

Posted on arXiv · May 21, 2026