Aran Nayebi
Assistant Professor of Machine Learning, Carnegie Mellon University (CMU)
Building a Natural Science of Intelligence 🧠🤖
Prev: ICoN Postdoctoral Fellow @MIT, PhD @Stanford NeuroAILab
Personal Website: cs.cmu.edu/~anayebi
- I'll be presenting my work *today* on the first formal guarantees addressing the decade-long open problem of Corrigibility (namely how we provably avoid loss of control with AI) in the AAAI Machine Ethics workshop (W37) at 3:15 pm ST in Tourmaline 207-209!
- For those who can't make it: Slides: anayebi.github.io/files/slides... Blogpost summary: www.lesswrong.com/posts/M5owRc...
- If you're attending AAAI, I'll be presenting this work on alignment barriers *today* as an Oral presentation in the Special Track on AI Alignment at 11 am ST in conference room J!
- Are there fundamental barriers to AI alignment once we develop generally-capable AI agents? We mathematically prove the answer is *yes*, and outline key properties for a "safe yet capable" agent. 🧵👇 Paper: arxiv.org/abs/2502.05934
- For those who can't make it, here's a pre-recording: www.youtube.com/watch?v=ZAoP...
- Blogpost summary: www.lesswrong.com/posts/M5owRc...
- Reposted by Aran Nayebi: Inspired by the natural curiosity he saw in animals, MLD Assistant Professor @anayebi.bsky.social and his CMU colleagues created a virtual zebrafish that acted like a real zebrafish without any prior training.
- It was a pleasure speaking at the inaugural BAMΞ Mathematical Phenomenology Sprint, where I discussed reverse-engineering natural intelligence with embodied agents and how NeuroAI could inform a science of subjective experience and welfare.
- This talk also discusses our NeuroAI Turing Test: bsky.app/profile/anay...
- As well as our recent NeurIPS '25 work on embodied agents & intrinsic motivation: bsky.app/profile/reec...
- 1/ I'm excited to share recent results from my first collaboration with the amazing @anayebi.bsky.social and @leokoz8.bsky.social ! We show how autonomous behavior and whole-brain dynamics emerge in embodied agents with intrinsic motivation driven by world models.
- And our NeurIPS '25 Oral on tactile processing: bsky.app/profile/trin...
- 1/ What if we make robots that process touch the way our brains do? We found that Convolutional Recurrent Neural Networks (ConvRNNs) pass the NeuroAI Turing Test in currently available mouse somatosensory cortex data. New paper by @Yuchen @Nathan @anayebi.bsky.social and me!
- We have 2 papers accepted to #AAAI2026 this year! The first paper 👇 on intrinsic barriers to alignment (establishing no free lunch theorems of encoding "all human values" & the inevitability of reward hacking) will appear as an *oral* presentation at the Special Track on AI Alignment.
- Are there fundamental barriers to AI alignment once we develop generally-capable AI agents? We mathematically prove the answer is *yes*, and outline key properties for a "safe yet capable" agent. 🧵👇 Paper: arxiv.org/abs/2502.05934
- 10 min video summary here: www.youtube.com/watch?v=ZAoP...
- It was an absolute pleasure giving the University of Toronto Robotics Institute seminar on "Using Embodied Agents to Reverse-Engineer Natural Intelligence". Check out the recording here: www.youtube.com/watch?v=E4Qm...
- Amazing talk last week by Dr. Aran Nayebi at #UofT on reverse-engineering the brain and building neuroscience-inspired AI. #neuroAI #compneuro @anayebi.bsky.social @utoronto.ca @uoftcompsci.bsky.social @vectorinstitute.ai
- Thank you to my wonderful & generous host @drlaschowski.bsky.social not only for showing me around the beautiful campus -- but also for leading the faculty group chat to help me find the hallowed location where AlexNet was originally developed (ultimately leading to Hinton being pinged to confirm)!
- Feel free to check out my new LessWrong post for a high-level summary of our two AAAI papers! "From Barriers to Alignment to the First Formal Corrigibility Guarantees" www.lesswrong.com/posts/M5owRc...
- We have 2 papers accepted to #AAAI2026 this year! The first paper 👇 on intrinsic barriers to alignment (establishing no free lunch theorems of encoding "all human values" & the inevitability of reward hacking) will appear as an *oral* presentation at the Special Track on AI Alignment.
- Feel free to check out my new LessWrong post for a high-level summary of this work! www.lesswrong.com/posts/dP8J6v...
- ...and that's a wrap for Fall 2025! In the final lecture of the semester, Matt Gormley & I covered bleeding-edge research topics in Generative AI, namely Interactive World Models + Science of AI Alignment. Next semester we plan to have our recordings publicly available on YouTube -- stay tuned!
- Course website: www.cs.cmu.edu/~mgormley/co... All lecture slides (publicly available) & topics: www.cs.cmu.edu/~mgormley/co... Video recordings are currently available to anyone at CMU! (These may become publicly available in the near future, will update if we do!)
- Matt's slides on Interactive World Models: www.cs.cmu.edu/~mgormley/co... My slides on the Science of AI Alignment: www.cs.cmu.edu/~mgormley/co...
- In today's Generative AI lecture, we cover code generation & autonomous agents, discussing how GitHub Copilot works, diving into multimodal agents (like Gemini 3 Pro!), and ending on AI scientists & AI for science. Lots more to explore in this rapidly growing space!
- Slides: www.cs.cmu.edu/~mgormley/co... Full course info: bsky.app/profile/anay...
- Reposted by Aran Nayebi: Join us December 5th at University of Toronto (in-person and online) for a special seminar by Dr. Aran Nayebi on reverse-engineering the brain and building neuroscience-inspired artificial intelligence. #neuroAI #compneuro @anayebi.bsky.social @utoronto.ca @uoftcompsci.bsky.social
- In today's Generative AI lecture, we dive into reasoning models by dissecting how DeepSeek-R1 works (GRPO vs. PPO, where GRPO removes the need for a separate value network + training with a simpler rule-based reward), and end on mechanistic interpretability to better understand those reasoning traces.
- Slides: www.cs.cmu.edu/~mgormley/co... Full course info: bsky.app/profile/anay...
- In today's Generative AI lecture, we primarily discuss scaling laws and the key factors that go into building large-scale foundation models. Slides: www.cs.cmu.edu/~mgormley/co... Full course info: bsky.app/profile/anay...
- We also discuss data quality & amount (where you get great performance with a smaller model trained on lots of tokens), how to get good data depending on your application, and Moravec's paradox for robotics foundation models.
- Finally, we briefly discuss Querying Transformers for text-image alignment, as a hold-over from last lecture on multimodal foundation models!
- Congratulations to my Ph.D. student Reece Keller for winning the best talk award at #CRSy25 on our project building the first task-optimized autonomous agent that predicts whole-brain data! Check out the post below for other cool talks!! Detailed summary: bsky.app/profile/reec...
- 🐟 @reecedkeller.bsky.social @cmu.edu explored autonomous behaviour in virtual zebrafish, where intrinsic motivation drives self-directed exploration.
- Full paper (to appear in NeurIPS 2025!) here: arxiv.org/abs/2506.00138
- Congrats to this year's Nobel Prize winners! Philippe's seminal work is in fact what our recent closed-form UBI AI capability threshold builds on: bsky.app/profile/anay...
- My ILIAD ’25 talk, “Intrinsic Barriers & Pathways to Alignment”: why “aligning to all human values” provably can’t work, why reward hacking is inevitable in large state spaces, & how small value sets bypass “no free lunch” limits to yield formal corrigibility. www.youtube.com/watch?v=Oajq...
- Thanks @undo-hubris.bsky.social for the invite & for hosting! Slides: anayebi.github.io/files/slides... Paper 1 (alignment barriers): arxiv.org/abs/2502.05934 Paper 1 summary: bsky.app/profile/anay... Paper 2 (corrigibility): arxiv.org/abs/2507.20964 Paper 2 summary: bsky.app/profile/anay...
- Are there fundamental barriers to AI alignment once we develop generally-capable AI agents? We mathematically prove the answer is *yes*, and outline key properties for a "safe yet capable" agent. 🧵👇 Paper: arxiv.org/abs/2502.05934
- A nice application of our NeuroAI Turing Test! Check out @ithobani.bsky.social's thread for more details on comparing brains to machines!
- 1/X Our new method, the Inter-Animal Transform Class (IATC), is a principled way to compare neural network models to the brain. It's the first to ensure both accurate brain activity predictions and specific identification of neural mechanisms. Preprint: arxiv.org/abs/2510.02523
- Honored to be quoted in this @newsweek.com article discussing how AI could accelerate the need for UBI. Read more here: www.newsweek.com/ai-taking-jo...
- Academic paper: bsky.app/profile/anay...
- In today's Generative AI lecture, we talk about all the different ways to take a giant auto-complete engine like an LLM and turn it into a useful chat assistant.
- Specifically, we cover everything from methods that don't update parameters, e.g. In-Context Learning / Prompt-Engineering / Chain-of-Thought Prompting, to methods that do, such as Instruction Fine-Tuning & building on IFT to perform full-fledged Reinforcement Learning from Human Feedback (RLHF).
- Next time we discuss how to optimize these reward models via DPO/policy gradients! Slides: www.cs.cmu.edu/~mgormley/co... Full course info: bsky.app/profile/anay...
- In today's Generative AI lecture, we discuss the 4 primary approaches to Parameter-Efficient Fine-Tuning (PEFT): subset fine-tuning, adapters, Prefix/Prompt Tuning, and Low-Rank Adaptation (LoRA). We show each of these amounts to fine-tuning a different aspect of the Transformer.
- Slides: www.cs.cmu.edu/~mgormley/co... Full course info: bsky.app/profile/anay...
- 1/6 Recent discussions (e.g. Rich Sutton on @dwarkesh.bsky.social’s podcast) have highlighted why animals are a better target for intelligence — and why scaling alone isn’t enough. In my recent @cmurobotics.bsky.social seminar talk, “Using Embodied Agents to Reverse-Engineer Natural Intelligence”,
- 2/6 I present a cohesive framework that develops these notions further, grounded in both machine learning and experimental neuroscience. In it, I outline our efforts over the past 4 years to set the capabilities of humans & animals as concrete engineering targets for AI.
- 3/6 By grounding agents in perception, prediction, planning, memory, and intrinsic motivation — and validating them against large-scale neural data from rodents, primates, and zebrafish — we show how neuroscience and machine learning can form a unified *science of intelligence*.
- 6/6 I close with reflections on AI safety and alignment, and the Q&A explores open questions: from building physically accurate (not just photorealistic) world models to the role of autoregression and scale. 🎥Watch here: www.youtube.com/watch?v=5deM... Slides: anayebi.github.io/files/slides...
- Excited to have this work accepted as an *oral* to NeurIPS 2025!
- 1/ What if we make robots that process touch the way our brains do? We found that Convolutional Recurrent Neural Networks (ConvRNNs) pass the NeuroAI Turing Test in currently available mouse somatosensory cortex data. New paper by @Yuchen @Nathan @anayebi.bsky.social and me!
- Check out our accompanying open-source library! bsky.app/profile/anay...
- Excited to have this work accepted to NeurIPS 2025! See you all in San Diego!
- 1/ I'm excited to share recent results from my first collaboration with the amazing @anayebi.bsky.social and @leokoz8.bsky.social ! We show how autonomous behavior and whole-brain dynamics emerge in embodied agents with intrinsic motivation driven by world models.