Daphne Cornelisse
PhD student at NYU | Building human-like agents | daphne-cornelisse.com
- What if you could train agents on a decade of driving experience in under an hour, on a single GPU? Excited to share PufferDrive 2.0: a fast, friendly driving simulator with RL training via PufferLib at 300K steps/sec 🐡 + 🚗 youtu.be/LfQ324R-cbE?...
- Several fast evals are included, too! Check out our release post: emerge-lab.github.io/PufferDrive/... Work done with Spencer Cheng* (co-first), Pragnay Mandavilli, Julian Hunt, Kevin Joseph, Waël Doulazmi, Valentin Charraut, Aditya Gupta, Joseph Suarez, and @eugenevinitsky.bsky.social
- Reposted by Daphne Cornelisse: Excited to share a new preprint, accepted as a spotlight at #NeurIPS2025! Humans are imperfect decision-makers, and autonomous systems should understand how we deviate from idealized rationality. Our paper aims to address this! 👀🧠✨ arxiv.org/abs/2510.25951 a 🧵⤵️
- Rapid RL experimentation is great. But how do you catch silent errors before they slip by? In this post, I share tools and habits that help me move quickly from idea to result without sacrificing reliability.
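One lightweight habit for catching silent errors is asserting invariants on every rollout batch before the gradient step. A minimal sketch of the idea; the function name, thresholds, and batch layout are illustrative, not taken from the post:

```python
import numpy as np

def validate_rollout(obs, rewards, dones):
    """Fail fast on common silent RL bugs: NaNs/infs, exploding
    rewards, and batches where no episode ever terminates."""
    assert np.isfinite(obs).all(), "non-finite observations"
    assert np.isfinite(rewards).all(), "non-finite rewards"
    assert np.abs(rewards).max() < 1e3, "suspiciously large reward"
    assert dones.dtype == bool, "dones must be boolean"
    assert dones.any(), "no episode terminated in this batch"
    return True

# Usage: call on each batch right before the policy update.
obs = np.zeros((128, 4), dtype=np.float32)
rewards = np.random.uniform(-1.0, 1.0, size=128)
dones = np.zeros(128, dtype=bool)
dones[-1] = True
validate_rollout(obs, rewards, dones)
```

Checks like these cost microseconds per batch but turn a silently corrupted run into an immediate, attributable crash.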
- Reposted by Daphne Cornelisse: The single biggest epistemic challenge in the internet era is remaining calibrated about what "normal" people think while the internet throws up an infinite wall of crazy. Thousands of people sharing an absurd opinion on the internet tells you very little!
- Overnight runs are the overnight oats of research — prep, forget, and rewarding by morning
- Reposted by Daphne Cornelisse: Building a "human-level" simulated driver that zero-shot generalizes to many benchmarks: a fun interview with @natolambert.bsky.social www.youtube.com/watch?v=2Q66...
- Sim agents are key for developing safety-critical autonomous systems, like self-driving cars. We're open-sourcing sim agents that achieve a 99.8% success rate with < 0.8% failures on the Waymo Open Dataset. These agents are built through scaling self-play.
- SOTA generative models trained on large human datasets show unintended behaviors like crashes (5-6%) and off-road events (6-12%) in benchmarks for nominal driving. Unpredictable deviations make it hard to separate signal from noise.
- We train sim agents using self-play PPO on 10K+ scenarios from the Waymo Open Dataset in GPUDrive, under a semi-realistic framework for human perception and control. Agents learn goal-directed behavior, avoiding collisions and staying on the road.
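The self-play idea above can be sketched schematically: one shared policy drives every agent in a scene, and all agents' transitions are pooled into a single buffer for one update. All names here (`SharedPolicy`, `env_step`, `collect_self_play_rollout`) are hypothetical stand-ins, not GPUDrive's actual API:

```python
import numpy as np

class SharedPolicy:
    """One policy controls every agent in the scene (self-play)."""
    def __init__(self, obs_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        # Tiny linear policy, standing in for a neural network.
        self.w = rng.normal(scale=0.01, size=(obs_dim, n_actions))

    def act(self, obs):
        logits = obs @ self.w          # (n_agents, n_actions)
        return logits.argmax(axis=-1)  # greedy, for the sketch

def collect_self_play_rollout(policy, env_step, obs, horizon=10):
    """Every agent's transition goes into one shared buffer, so a
    single PPO-style update trains all agents at once."""
    buffer = []
    for _ in range(horizon):
        actions = policy.act(obs)               # one action per agent
        next_obs, rewards = env_step(obs, actions)
        buffer.append((obs, actions, rewards))
        obs = next_obs
    return buffer
```

Because every agent is both the learner and the traffic around the learner, scaling up self-play on thousands of scenarios gradually removes exploitable behaviors.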
- This was joint work with Aarav Pandya, Kevin Joseph, Joseph Suárez, and @eugenevinitsky.bsky.social
- GPUDrive got accepted to ICLR 2025! With that, we release GPUDrive v0.4.0! 🚨 You can now install the repo and run your first fast PPO experiment in under 10 minutes. I’m honestly so excited about the new opportunities and research the sim makes possible. 🚀 1/2
- Huge thanks to my incredible collaborators for making this possible: Saman Kazemkhani, Aarav Pandya, @eugenevinitsky.bsky.social, Joseph Suarez for converting the sim to a package and optimizing the PPO loop, and Kevin Joseph for all his help with data processing, tutorials, and more! 😊
- Link to repo: github.com/Emerge-Lab/g...