Really excited to share our new preprint led by @ahmadsamara.bsky.social with Zaid Zada, @vanderlab.bsky.social, and Uri Hasson, titled "Cortical language areas are coupled via a soft hierarchy of model-based linguistic features"
doi.org/10.1101/2025...

Language comprehension emerges from the coordinated activity of a number of different brain areas. Different cortical language areas surely make unique contributions to language processing, but why do we observe such overwhelming functional similarity across regions?
We propose that different regions of the language network are coupled to one another via a multidimensional space of shared linguistic features. This shared space can serve as a "communication channel" to coordinate the contributions of different regions to the overall network.
This hypothesis is inspired by converging ideas of a "communication subspace" in systems neuroscience (e.g., Semedo et al., Neuron, 2019) and the high-dimensional "residual stream" linking layers in LLMs (e.g., Elhage et al., Anthropic, 2021).
We used fMRI data collected while subjects listened to naturalistic spoken stories and extracted three types of linguistic embeddings for the same stimuli from the Whisper speech and language model. We call these "acoustic", "speech", and "language" embeddings.
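For readers who want a concrete picture, here's a minimal sketch of how such embeddings can be pulled from Whisper via HuggingFace transformers. The layer choices below (log-mel input features as "acoustic", encoder states as "speech", decoder states as "language") and the placeholder inputs are illustrative assumptions, not necessarily the paper's exact pipeline:

```python
# Illustrative sketch: extracting three levels of Whisper embeddings
# for a story stimulus. Layer choices are assumptions, not the paper's spec.
import numpy as np
import torch
from transformers import WhisperProcessor, WhisperModel

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperModel.from_pretrained("openai/whisper-tiny").eval()

# Placeholder inputs: in practice these would be the story audio (16 kHz)
# and its transcript, neither of which is shown here.
waveform = np.zeros(16000 * 30, dtype=np.float32)
transcript = "once upon a time"

inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")
decoder_ids = processor.tokenizer(transcript, return_tensors="pt").input_ids

with torch.no_grad():
    outputs = model(input_features=inputs.input_features,
                    decoder_input_ids=decoder_ids)

acoustic = inputs.input_features              # log-mel spectrogram ("acoustic")
speech = outputs.encoder_last_hidden_state    # encoder states ("speech")
language = outputs.last_hidden_state          # decoder states ("language")
```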
Traditional within-subject functional connectivity metrics cannot distinguish intrinsic from extrinsic (i.e., stimulus-driven) co-fluctuations between brain regions.
Following the logic of intersubject correlation (ISC) analysis, intersubject functional connectivity (ISFC) isolates stimulus-driven connectivity between regions (e.g., in response to naturalistic stimuli)—but is agnostic to the content of the stimulus shared between regions.
Put differently, ISFC can tell us *where* and *how much* connectivity is driven by the stimulus, but not *what* stimulus features are driving the connectivity. How can we begin to unravel what linguistic features are shared across different language regions?
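For context, the ISFC logic itself is simple: correlate each region's time series in one subject with *other* subjects' time series in every region, so that only stimulus-locked covariance survives. A minimal NumPy sketch (array shapes and names are illustrative):

```python
# Minimal sketch of intersubject functional connectivity (ISFC).
# `data` is assumed to have shape (n_subjects, n_regions, n_timepoints),
# each row a regional response time series to the same naturalistic stimulus.
import numpy as np

def isfc(data):
    n_subjects, n_regions, _ = data.shape
    result = np.zeros((n_subjects, n_regions, n_regions))
    for s in range(n_subjects):
        # Correlate subject s's regions with the average of all *other*
        # subjects' regions; intrinsic fluctuations are not shared across
        # subjects, so only stimulus-driven covariance survives.
        others = np.mean(np.delete(data, s, axis=0), axis=0)
        for i in range(n_regions):
            for j in range(n_regions):
                result[s, i, j] = np.corrcoef(data[s, i], others[j])[0, 1]
    return result.mean(axis=0)  # average ISFC matrix across subjects
```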
We developed a model-based framework for quantifying stimulus-driven, feature-specific connectivity between regions. We used parcel-wise encoding models to align feature-specific embeddings to brain activity and then evaluated how well these models generalize to other parcels.
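In rough pseudocode terms: fit an encoding model from feature embeddings to one parcel, then test its predictions against a *different* parcel. A hedged sketch using ridge regression (a common choice for encoding models; the paper's exact estimator and cross-validation scheme may differ):

```python
# Sketch of feature-specific, model-based connectivity between parcels:
# train an encoding model on a source parcel, then score how well it
# generalizes to a target parcel. Names and shapes are illustrative.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge

def model_connectivity(embeddings, parcel_a, parcel_b, train, test, alpha=1.0):
    """embeddings: (n_TRs, n_features); parcel_a/b: (n_TRs,) time series;
    train/test: index arrays for held-out evaluation."""
    model = Ridge(alpha=alpha)
    model.fit(embeddings[train], parcel_a[train])  # fit on source parcel
    predicted = model.predict(embeddings[test])    # feature-based prediction
    r, _ = pearsonr(predicted, parcel_b[test])     # generalize to target parcel
    return r
```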
We show that early auditory areas are coupled to intermediate language areas via lower-level acoustic and speech features. In contrast, higher-order language and default-mode regions are predominantly coupled through more abstract language features.
We observe a clear progression of feature-specific connectivity from early auditory to lateral temporal areas, advancing from acoustic-driven connectivity to speech- and finally language-driven connectivity.
Taking a slightly different approach, we assess how well specific model features capture larger-scale patterns of connectivity. We find that feature-specific model connectivity partly recapitulates stimulus-driven cortical network configuration.
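One simple way to quantify "recapitulates" is to correlate the model-based connectivity matrix with the stimulus-driven (ISFC) matrix across region pairs. The sketch below uses a Spearman correlation of the matrices' upper triangles, an illustrative choice rather than necessarily the paper's analysis:

```python
# Sketch: compare a feature-specific model connectivity matrix against the
# stimulus-driven (ISFC) connectivity matrix over unique region pairs.
import numpy as np
from scipy.stats import spearmanr

def matrix_similarity(model_conn, isfc_conn):
    iu = np.triu_indices_from(model_conn, k=1)  # off-diagonal upper triangle
    rho, _ = spearmanr(model_conn[iu], isfc_conn[iu])
    return rho
```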
Our findings suggest that different language areas are coupled via a mixture of linguistic features—this yields what we refer to as a "soft hierarchy" from lower-order to higher-order language areas, and may facilitate efficient, context-sensitive language processing.
We're very excited to share this work and happy to hear your feedback! If you're attending #OHBM2025, @ahmadsamara.bsky.social will be presenting a poster on this project (poster #804, June 25 and 26). Be sure to stop by and chat with him about it!
https://www.biorxiv.org/content/10.1101/2025.06.02.657491v1
Very cool, congrats and looking forward to reading this!