Sai Prasanna
See(k)ing the surreal
Causal World Models for Curious Robots @ University of Tübingen/Max Planck Institute for Intelligent Systems 🇩🇪
#reinforcementlearning #robotics #causality #meditation #vegan
- Use Beta-NLL for regression when you also predict standard deviations; it's a simple change to the NLL loss that reliably works better.
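Beta-NLL (I believe from Seitzer et al.'s work on pitfalls of heteroscedastic uncertainty estimation) weights each sample's Gaussian NLL by the predicted variance raised to a power β, with the gradient through that weight stopped. A minimal numpy sketch of the idea (function name and signature are my own; in an autodiff framework the variance weight must be gradient-detached):

```python
import numpy as np

def beta_nll(mu, sigma2, y, beta=0.5):
    """Beta-NLL: per-sample Gaussian NLL, weighted by sigma2 ** beta.

    beta = 0 recovers the standard NLL; beta = 1 weights samples
    roughly like plain MSE. In PyTorch/JAX the sigma2 ** beta factor
    must be gradient-detached (e.g. sigma2.detach() ** beta) so it
    acts as a pure per-sample weight, not part of the objective.
    """
    nll = 0.5 * (np.log(sigma2) + (y - mu) ** 2 / sigma2)
    return sigma2 ** beta * nll

# beta = 0 gives the plain Gaussian NLL
print(beta_nll(0.0, 1.0, 1.0, beta=0.0))  # 0.5
```

The detached weight is the whole trick: it upweights high-variance samples just enough that the mean predictor keeps learning, instead of the NLL's tendency to inflate variance and ignore hard points.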
- If open-endedness must fundamentally be measured subjectively, and we fix humans as the final arbiter or evaluator, which properties of the agent make it open-ended? Does the agent's embodiment, action space, etc. matter to a human evaluator of open-endedness?
- Tübingen : Freiburg :: Introvert : Extrovert
- Tübingen
- Freiburg
- But this is based on the vibes of Tübingen from a 1.5-day visit; I have lived in Freiburg for 3 years.
- Reposted by Sai Prasanna: This might be the most fun I’ve had writing an essay in a while. Felt some of that old going-nuts-with-an-idea energy flowing. open.substack.com/pub/contrapt...
- Reposted by Sai Prasanna: This week's #PaperILike is "Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming" (Bertsekas 2024). If you know 1 of {RL, controls} and want to understand the other, this is a good starting point. PDF: arxiv.org/abs/2406.00592
- I realized how I background-process tonnes of information, from work/research to emotional stuff. And it works well: it leads to good research ideas and wise processing of tough situations! But it's so hard to learn to trust this, as conscious thinking for solving problems feels more under my "control".
- One strategy, I guess, is to have a steady stream of good (BS-filtered) and diverse (topics, areas) inputs (books, research papers, what not), and not get bogged down by the fact that I am too distracted to go deep into one input stream (book, podcast, article, or paper) at a time.
- TIL: the "Clever Hans cheat" for next-token prediction. A subtle but interesting issue: in the purely forward next-token prediction objective, teacher forcing can lead to learning dynamics where models don't even generalize "in-distribution"!! arxiv.org/abs/2403.06963
- This is orthogonal to the better-known compounding-error problem in autoregression and the distribution-mismatch issue in teacher forcing.
- This failure occurs in-distribution, not OOD. And it is apparently general for any model learning next-token prediction, regardless of recurrence (linear or otherwise) or attention!
- The conditioning gap in latent-space world models arises because uncertainty can go either into the latent posterior distribution or into the learnt prior (dynamics model); not conditioning on the future puts the uncertainty, incorrectly, into the dynamics model.
- Break the Monday productivity ceiling with this super awesome 4-hour techno set on.soundcloud.com/hXTcWTTsYUNK...
- This album is going to be timeless open.spotify.com/album/32yQDx...
- Their aesthetic is soooo gooooood m.youtube.com/watch?v=hGQu...
- Monday kick-starter open.spotify.com/track/6QXjBA...
- Caffeinate and hard techno to keep the pace going 🔥 open.spotify.com/track/5WLHfd...
- If I have a really good photo that could potentially be used in many contexts, what's the best place to make money with it? My friend has a really good eye for photos, and we want to try a side venture selling some of her stuff.
- Reposted by Sai Prasanna: RIP Manmohan Singh. Dude changed all our lives in 1991 for the better. His stint as turnaround finance minister was revolutionary even if his later stint as PM was rather hapless (for which the Nehru dynasty is more to blame).
- Reposted by Sai Prasanna: Looks like a cool study. Lots to learn from ants about large-scale coordination www.pnas.org/doi/10.1073/... "Our results exemplify how simple minds can easily enjoy scalability while complex brains require extensive communication to cooperate efficiently." h/t @petersuber.bsky.social
- Wednesday Quirky mood open.spotify.com/track/3RBhQ7...
- Does augmenting ourselves with V/LLMs to fill cognitive gaps make self-actualization even more difficult on average? It stands in stark contrast with augmentation strategies like meditation or psychedelics, which are more difficult and slower to show positive outcomes.
- For example, for people for whom it takes effort and energy to read facial expressions and empathise emotionally instead of cognitively, one can see AR glasses with VLM support easily fixing the baseline ability. But it comes at the cost of letting direct emotional sensitivity degrade further.
- Maybe it's not either/or; it depends on the level at which these things can be customised? But the gap between direct neural augmentation/shift and indirect ways feels insurmountable, at least in the AR-glasses type of interface. Maybe direct neural augmentation devices in the future change that.
- Mass Effect 3's ending "choices" don't feel so bad or unrealistic seen in this light: the ultimate end of augmenting ourselves with sophisticated tools, and gradually with tools that seem more like us in some ways. I picked the merge.
- Reposted by Sai Prasanna: The slides for my lectures on (Bayesian) Active Learning, Information Theory, and Uncertainty are online now 🥳 They cover quite a bit from basic information theory to some recent papers: blackhc.github.io/balitu/ and I'll try to add proper course notes over time 🤗