Tom Schaul
RL researcher at DeepMind
schaul.site44.com 🇱🇺
- Reposted by Tom Schaul: Hi RL Enthusiasts! RLC is coming to Montreal, Quebec, in the summer: Aug 16–19, 2026! Call for Papers is up now. Abstract: Mar 1 (AOE); Submission: Mar 5 (AOE). Excited to see what you've been up to - submit your best work! rl-conference.cc/callforpaper... Please share widely!
- Could we meta-learn which data to train on? Yes! Does this make LLM training more efficient? Yes! Would you like to know exactly how? arxiv.org/pdf/2505.17895 (come see us at NeurIPS too!)
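For intuition only, here is a minimal sketch of generic meta-gradient data weighting, not the paper's actual method: an outer loop adjusts per-example weights so that one inner gradient step on the weighted batch lowers held-out loss. The toy regression setup and all names (weighted_loss, meta_objective) are illustrative assumptions.

```python
# Toy sketch of meta-learned data weighting (illustrative, not the paper's code).
# Outer loop: tune per-example weights so that an inner SGD step on the
# weighted training loss reduces loss on a held-out batch.
import jax
import jax.numpy as jnp

def loss(params, x, y):
    # plain squared error for a linear model
    return jnp.mean((x @ params - y) ** 2)

def weighted_loss(params, weights, x, y):
    # per-example losses mixed according to meta-learned weights
    per_example = (x @ params - y) ** 2
    return jnp.sum(jax.nn.softmax(weights) * per_example)

def meta_objective(weights, params, train, val, lr=0.1):
    # one inner SGD step on the weighted training loss...
    g = jax.grad(weighted_loss)(params, weights, *train)
    new_params = params - lr * g
    # ...evaluated by the held-out loss it produces
    return loss(new_params, *val)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 4))
true_w = jnp.array([1.0, -2.0, 0.5, 3.0])
y = x @ true_w
xv = jax.random.normal(jax.random.PRNGKey(1), (16, 4))
yv = xv @ true_w

params = jnp.zeros(4)
weights = jnp.zeros(32)  # per-example logits, meta-learned
for _ in range(100):
    meta_grad = jax.grad(meta_objective)(weights, params, (x, y), (xv, yv))
    weights -= 0.5 * meta_grad                                        # outer update
    params -= 0.1 * jax.grad(weighted_loss)(params, weights, x, y)    # inner update
```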
- Where do some of Reinforcement Learning's great thinkers stand today? Find out! Keynotes of the RL Conference are online: www.youtube.com/playlist?lis... Wanting vs liking, Agent factories, Theoretical limit of LLMs, Pluralist value, RL teachers, Knowledge flywheels (guess who talked about which!)
- Deadline to apply is this Wednesday!
- Ever thought of joining DeepMind's RL team? We're recruiting for a research engineering role in London: job-boards.greenhouse.io/deepmind/job... Please spread the word!
- The RL team is a small team led by David Silver. We build RL algorithms and solve ambitious research challenges. As one of DeepMind's oldest teams, it has been instrumental in building DQN, AlphaGo, Rainbow, AlphaZero, MuZero, AlphaStar, AlphaProof, Gemini, etc. Help us build the next big thing!
- When faced with a challenge (like debugging), it helps to think back to examples of how you've overcome challenges in the past. Same for LLMs! The method we introduce in this paper is efficient because examples are chosen for their complementarity, leading to much steeper inference-time scaling! 🧪
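For intuition only, a minimal sketch of one way to pick complementary examples rather than merely individually similar ones, not necessarily the paper's algorithm: greedy max-marginal-relevance selection, where each new example should be relevant to the query but non-redundant with the examples already chosen. The embeddings and function names are hypothetical.

```python
# Toy sketch: greedily select k in-context examples that complement each other
# (max marginal relevance), instead of the k most similar to the query.
import numpy as np

def select_examples(query_emb, candidate_embs, k=4, lam=0.5):
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    selected = []
    remaining = list(range(len(candidate_embs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cos(query_emb, candidate_embs[i])
            # redundancy with what we've already picked
            redundancy = max((cos(candidate_embs[i], candidate_embs[j])
                              for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Usage: embed past "how I solved it" episodes, then pick a complementary set.
rng = np.random.default_rng(0)
episodes = rng.normal(size=(100, 64))   # hypothetical episode embeddings
query = rng.normal(size=64)
print(select_examples(query, episodes, k=4))
```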
- Some extra motivation for those of you in RLC deadline mode: our line-up of keynote speakers -- as all accepted papers get a talk, they may attend yours! @rl-conference.bsky.social
- 200 great visualisations: 200 facets and nuances of 1 planetary story.
- My annual decarbonization presentation is here. 200 slides, covering everything from water levels in Lake Gatún to sulfur dioxide emissions to ESG fund flows to Chinese auto exports to artificial intelligence. www.nathanielbullard.com/presentations
- Reposted by Tom Schaul: Excited to announce the first RLC 2025 keynote speaker, a researcher who needs little introduction, whose textbook we've all read, and who keeps pushing the frontier on RL with human-level sample efficiency
- Could language games (and playing many of them) be the renewable energy that Ilya was hinting at yesterday? They do address two core challenges of self-improvement -- let's discuss! My talk is today at 11:40am, West Meeting Room 220-222, #NeurIPS2024 language-gamification.github.io/schedule/
- Don't get to talk enough about RL during #neurips2024? Then join us for more, tomorrow night at The Pearl!
- This year's (first-ever) RL conference was a breath of fresh air! And now that it's established, the next edition is likely to be even better: Consider sending your best and most original RL work there, and then join us in Edmonton next summer!
- Are there limits to what you can learn in a closed system? Do we need human feedback in training? Is scale all we need? Should we play language games? What even is "recursive self-improvement"? Thoughts about this and more here: arxiv.org/abs/2411.16905
- I'll also be giving a talk about this at the @neuripsconf.bsky.social workshop on "Language Gamification" in two weeks. Pop by if you're around! language-gamification.github.io