1/X Excited to present this preprint on multi-tasking, with
@david-g-clark.bsky.social and Ashok Litwin-Kumar! Timely too, as “low-D manifold” has been trending again. (If you read through to the end, we escape Flatland and return to the glorious high-D world we deserve.)
www.biorxiv.org/content/10.6...
A theory of multi-task computation and task selection
Neural activity during the performance of a stereotyped behavioral task is often described as low-dimensional, occupying only a limited region in the space of all firing-rate patterns.
2/X Recent experimental evidence and compelling Bluesky posts are complicating the low-D manifold story—not that it isn’t true when measured in the context of a single, low-complexity task, as Gao et al. (biorxiv 2017) argue pretty much has to happen...
3/X ... but that there seem to be different manifolds that the network state occupies depending on what task is performed (Sabatini et al. Nat Comm 2024, Borgognon et al. Nat Comm 2025).
4/X One study in particular beautifully decomposes neural activity into distinct subspaces whose dynamics represent “subskills” that are flexibly activated and combined to perform complex force patterns (Amematsro et al. biorxiv 2025).
5/X These subspaces are heterogeneously oriented and contain fundamentally different dynamics, yet involve overlapping sets of neurons. (Figure from Amematsro et al.)
6/X Let’s take this picture at face value: within a given task, neural activity occupies a low-D manifold, but the network switches to different manifolds to perform different tasks. How is this possible? How do the connectivity structures supporting these dynamics avoid interfering with each other?
7/X Theoretical work on multi-tasking has typically involved trained RNNs. We opt for an analytic approach to gain insight into the relationship between connectivity and dynamics, building on the theory of low-rank networks (Mastrogiuseppe & Ostojic Neuron 2018, and follow-ups), which are a tractable model for nonlinear dynamics on a low-D manifold.
8/X In particular, we ask what happens when we linearly superpose different connectivity matrices, each of which is constrained to be low rank (R) and would generate, on its own, some nonlinear dynamical system on some low-dimensional manifold.
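(Not from the paper, just a toy numpy sketch of the setup as I read it, with my own loading statistics and parameter values: two rank-2 components are superposed, one of which alone would drive oscillatory latent dynamics and the other bistable latent dynamics, and activity is projected onto each task's loading vectors to read out that task's latent variables.)

```python
import numpy as np

rng = np.random.default_rng(0)
N, R = 1000, 2            # neurons, rank of each task component
phi = np.tanh             # single-neuron nonlinearity

def low_rank_component(A):
    """Rank-R connectivity (M @ Ntil.T) / N whose latent coupling is ~A (Gaussian loadings)."""
    M = rng.standard_normal((N, R))
    Ntil = M @ A.T + rng.standard_normal((N, R))   # overlap (1/N) Ntil.T @ M is approx. A
    return (M @ Ntil.T) / N, M

A_osc = 1.6 * np.array([[1.0, 2.0], [-2.0, 1.0]])  # complex eigenvalues: oscillatory latent dynamics
A_bis = 1.5 * np.eye(R)                            # real eigenvalues > 1: bistable latent dynamics
J_A, M_A = low_rank_component(A_osc)
J_B, M_B = low_rank_component(A_bis)
J = J_A + J_B                                      # superpose the two task components

x, dt = rng.standard_normal(N), 0.05
for _ in range(4000):                              # Euler integration of dx/dt = -x + J phi(x)
    x = x + dt * (-x + J @ phi(x))

# latent activity in each task's subspace (x is approx. a combination of the columns of M_A, M_B)
print("|kappa_A| =", np.linalg.norm(M_A.T @ x / N),
      "|kappa_B| =", np.linalg.norm(M_B.T @ x / N))
```

Per the result described in the next posts, one of the two latent norms should end up of order one while the other collapses toward zero; which one "wins" depends on their relative strengths, not on the initial condition.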
9/X The answer is surprisingly simple: the “strongest” (in a sense we make precise) latent dynamical system wins and controls the dynamics of the whole network, while the weaker dynamics are suppressed to the origin. This happens for any initial condition.
10/X In this example, the two connectivities we superposed would have produced stable limit cycle dynamics and bistable dynamics, respectively, if each were the sole network connectivity. When combined (previous post), the limit cycle "wins" because it's marginally stronger in this case.
11/X Thus we identified a kind of “interference” between task-related connectivity components, and it’s fairly one-sided: at most one task’s associated latent dynamical system does its thing, while every other task’s is suppressed to the origin and does nothing at all.
12/X Why is this the case? The connectivity is generated so that manifolds across tasks are approximately orthogonal (exactly orthogonal as the number of neurons N -> infinity). Can these dynamics not peacefully coexist in orthogonal subspaces, a mechanism proposed for continual learning (Duncker et al. Neurips 2020)?
13/X This interference occurs for two reasons: 1) the two task manifolds share a common pool of neurons and 2) the network is nonlinear. The task-related dynamics share an average neuronal gain factor. When one task is active, the average neuronal gains decrease, weakening the other task’s dynamics.
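(A rank-1 caricature of that mechanism, in my notation rather than the paper's: both tasks' latent variables are multiplied by the same population-averaged gain.)

```latex
\dot{\kappa}_A = -\kappa_A + \langle \phi'(x) \rangle \, \sigma_A \, \kappa_A ,
\qquad
\dot{\kappa}_B = -\kappa_B + \langle \phi'(x) \rangle \, \sigma_B \, \kappa_B .
```

When κ_A grows, single-neuron activity pushes into the saturating part of φ, the shared gain ⟨φ′(x)⟩ drops, and the effective coupling ⟨φ′⟩σ_B for task B can fall below the value needed to sustain activity, sending κ_B to zero.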
14/X From the perspective of flexible behavior, this is a problem. If, for a given connectivity, one task always wins, how does a network use dynamics A for task A and dynamics B for task B? We’ll get there after a brief detour, but sneak peek: it’s through modulating connectivity.
15/X Detour: if we add in more and more task-related components to the network connectivity, i.e. make the number of them comparable to the number of neurons N, we can enter a regime where not even one task dominates, and instead there are chaotic fluctuations in every task subspace simultaneously.
16/X The fluctuations in each subspace can be described by our theory as a subspace-specific linear dynamical system driven by noise. But there is no explicitly added noise—it emerges from a large number of task-related subspaces slightly overlapping and producing effectively random cross-talk.
17/X Interestingly, these chaotic dynamics arise just from summing many task-related components—there is no unstructured background connectivity in this model. Moreover, the chaos itself isn’t unstructured but has signatures of the associated task dynamics in each subspace simultaneously.
18/X We think of this chaotic state as the “spontaneous” state of the network, where no task is activated. A task activates when the noisy, linearized dynamics lose stability, so that the associated latent variables grow exponentially (before nonlinearly self-stabilizing) to dominate the network.
19/X This can be achieved through modulating the strength of the associated connectivity component. Because any one connectivity component is low rank, this can be biologically implemented via gain modulation of an external loop, eg through thalamus (as in Logiaco et al Cell Reports 2021).
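(To unpack why low rank makes this cheap: modulating the strength of one rank-R component is algebraically identical to putting a gain on a rank-R loop through an external structure. A minimal illustration of my own, with hypothetical variable names.)

```python
import numpy as np

rng = np.random.default_rng(1)
N, R = 1000, 2
phi = np.tanh
M_s = rng.standard_normal((N, R))   # loop "return" vectors (e.g. thalamus -> cortex)
N_s = rng.standard_normal((N, R))   # loop "readout" vectors (e.g. cortex -> thalamus)
x = rng.standard_normal(N)          # current network state
g_s = 1.3                           # modulatory gain on this task's component

# scaling the full N x N connectivity component...
direct = (g_s / N) * (M_s @ N_s.T) @ phi(x)
# ...gives exactly the same recurrent input as gain-modulating a rank-R external loop
via_loop = M_s @ (g_s * (N_s.T @ phi(x)) / N)
assert np.allclose(direct, via_loop)
```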
20/X This provides a mechanism for selecting dynamics. We identify 3 regimes *per task*, as the overall strength of that task’s connectivity component is modulated: the spontaneous state, then the chaotic and nonchaotic task-selected states.
21/X In both task-selected states, there is strong activity in the subspace of the selected task. The chaotic task-selected state features both coherent task dynamics (noiseless to leading order) as well as fluctuations in single-neuron rates comparable in magnitude to their task-related tuning.
22/X About the selection mechanism: yes we are modulating connectivity itself. Yes this is arguably "cheating" the multi-task challenge, traditionally thought of as a fixed-connectivity network prompted by inputs to do different things (Yang et al Nat Neuro 2019, Driscoll et al Nat Neuro 2024).
23/X But 1) Maybe the brain does this? 2) The modulation required is *subtle,* a vanishingly small fraction of the overall weight matrix, and is itself low-dimensional. And yet it is sufficient to induce large-scale activity changes, because it operates via a phase transition.
24/X As promised, let’s examine the dimension (participation ratio) of these states' activity patterns. The spontaneous state is high-dimensional, in the sense that its dimension scales with the size of the network N. For larger and larger networks, this dimension can grow without bound.
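(Dimension here means the participation ratio of the neuron-neuron covariance. A quick sketch, not the paper's code, of why roughly independent fluctuations give a dimension that scales with N, while activity confined to an R-dimensional subspace does not.)

```python
import numpy as np

def participation_ratio(X):
    """PR = (sum_i lam_i)^2 / sum_i lam_i^2, where lam_i are eigenvalues
    of the neuron-neuron covariance of activity X (time x neurons)."""
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return lam.sum() ** 2 / (lam ** 2).sum()

rng = np.random.default_rng(0)
T, N, R = 5000, 300, 2
# independently fluctuating neurons: PR close to N (and growing if N grows)
print(participation_ratio(rng.standard_normal((T, N))))
# activity confined to an R-dimensional subspace: PR <= R, regardless of N
print(participation_ratio(rng.standard_normal((T, R)) @ rng.standard_normal((R, N))))
```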
25/X A different but similar-in-vibe observation has been made in experimental work, that *measuring* more neurons leads to higher dimension (figure from Manley et al Neuron 2024), which admittedly isn’t quite the same as increasing the number of neurons that exist.
26/X Nonetheless, in our spontaneous state each neuron adds a new (fraction of a) dimension since neurons fluctuate approximately independently of one another, with a proportionality constant that is a nonlinear function of the network parameters and can be quite small (see Clark et al PRX 2025).
27/X The task-selected states are much lower-D, since so much of their variance is captured by just the handful of selected-task dimensions. In the chaotic task-selected states, fluctuations lead to a marginally higher (less low?) dimension that can exceed the task dimension R, but not in a way that scales with N.
28/X This suggests that, even if we include trial-to-trial fluctuations that are approximately independent between neurons, we won’t get high dimensionality just from the existence (and measurement) of a large number of neurons, if we measure while restricted to just a single task context.
29/X Although any one task component generates low-D activity when selected, recall that these task manifolds are randomly oriented with respect to one another. If we measure over sequential activation of many different, individually low-D task-selected states, we recover high-D activity overall.
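(Same toy construction as above, my own sketch: pool activity from K randomly oriented rank-R subspaces of an N-dimensional space, and the participation ratio climbs with K, even though each state on its own is only ~R-dimensional.)

```python
import numpy as np

rng = np.random.default_rng(0)
N, R, T = 500, 2, 200      # neurons, dimension per task state, samples per task state

def participation_ratio(X):
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return lam.sum() ** 2 / (lam ** 2).sum()

for K in (1, 5, 25, 100):  # number of task-selected states pooled into one "recording"
    blocks = [rng.standard_normal((T, R)) @ rng.standard_normal((R, N)) for _ in range(K)]
    print(K, round(participation_ratio(np.vstack(blocks)), 1))
# pooled dimension grows with K (sublinearly, since random subspaces overlap slightly
# and per-state variances are unequal), while any single state stays near R
```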
30/X This can even exceed the dimension of the spontaneous state, of course depending on a few things (how big N is, how many different task-selected states are chosen, etc.).
31/X Even in cases where the “switching among different tasks” setup has higher overall dimension than the spontaneous state, the rate at which measured dimension grows with recording time is slower (previous post). This is especially true the longer the network lingers in each task-specific subspace.
32/X We conservatively used fairly quick task-switching intervals so as not to artificially magnify this effect, and we still see slower dimension growth in this setup than in the spontaneous state. Spontaneous activity explores the dimensions available to it as quickly as possible.
33/X This model clearly delineates two hypotheses for the origin of high-dimensional neural activity. One is spontaneous fluctuations in the absence of coherent behavior. The other is switching among many different, individually low-dimensional, behavioral states.
34/X In the former, high dimensionality emerges simply because the network is big. In the latter, high dimensionality emerges because the network is doing a lot of different things. Both are viable!
35/X Borrowing the language of “behavioral syllables” (Markowitz et al. Nature 2023), we can formulate a few predictions. Measured during periods of no behavior, neural activity should be fairly high-D. Measured over many repeats of a single behavioral syllable, neural activity should be low-D.
36/X Pooled across behavioral syllables, we should recover high dimensionality, with higher dimension the more types of behavioral syllables that are observed. But we should observe lower growth of dimension per unit recording time, compared to pooling across periods of no behavior.
37/X Overall, really enjoyed working on this with
@david-g-clark.bsky.social and Ashok, and I’d love to chat about any part of it—the details behind the theory, the experimental implications, etc. Thanks for reading!