François Fleuret
Research Scientist Meta/FAIR, Prof. University of Geneva, co-founder Neural Concept SA. I like reality.
https://fleuret.org
- I asked "on the other platform" what were the most important improvements to the original 2017 transformer. That was quite popular and here is a synthesis of the responses:
- - Prenorm: normalization in the residual blocks before the attention operation and the FFN respectively - GQA (Group Query Attention): more Q than (K, V)
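A minimal PyTorch sketch of those two items together, as an illustration rather than anything from the thread: a pre-norm residual block with grouped-query attention, where `repeat_interleave` broadcasts each shared (K, V) head across its group of query heads. All module names and dimensions are my own choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreNormGQABlock(nn.Module):
    """Illustrative transformer block: pre-norm residuals + GQA."""

    def __init__(self, dim=512, n_q_heads=8, n_kv_heads=2):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.n_q, self.n_kv = n_q_heads, n_kv_heads
        self.head_dim = dim // n_q_heads
        self.ln1, self.ln2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.wq = nn.Linear(dim, n_q_heads * self.head_dim, bias=False)
        self.wk = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.wv = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.wo = nn.Linear(dim, dim, bias=False)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):  # x: (B, T, dim)
        B, T, _ = x.shape
        h = self.ln1(x)  # pre-norm: normalize *before* attention
        q = self.wq(h).view(B, T, self.n_q, self.head_dim).transpose(1, 2)
        k = self.wk(h).view(B, T, self.n_kv, self.head_dim).transpose(1, 2)
        v = self.wv(h).view(B, T, self.n_kv, self.head_dim).transpose(1, 2)
        # GQA: each group of query heads shares one (K, V) head
        rep = self.n_q // self.n_kv
        k = k.repeat_interleave(rep, dim=1)
        v = v.repeat_interleave(rep, dim=1)
        a = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        x = x + self.wo(a.transpose(1, 2).reshape(B, T, -1))
        x = x + self.ffn(self.ln2(x))  # pre-norm before the FFN too
        return x

x = torch.randn(2, 16, 512)
print(PreNormGQABlock()(x).shape)  # torch.Size([2, 16, 512])
```

With 8 query heads and 2 (K, V) heads, the key/value projections are 4x smaller than in standard multi-head attention, which is the main practical benefit of GQA: a smaller KV cache at inference time.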
- "You are in Paris, enjoy the city, stop obsessing with AI" Paris:
- Maybe the wall was the friends we made during that journey, Ted.
- I was the guest on the 19h30 news program on @radiotelesuisse.bsky.social this evening, talking about Artificial Intelligence. www.rts.ch/play/tv/19h3...
- It is hard to overstate how cool and powerful flex attention is. @chhillee.bsky.social pytorch.org/blog/flexatten… TL;DR: it is an implementation of the attention operator in PyTorch that, in particular, lets you efficiently "carve" the attention matrix. 1/3
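A hedged sketch of that "carving", assuming the torch.nn.attention.flex_attention module shipped with recent PyTorch (2.5+); the causal pattern and all shapes here are my own illustration, not from the thread.

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

device = "cuda" if torch.cuda.is_available() else "cpu"
B, H, S, D = 2, 4, 256, 64  # batch, heads, sequence length, head dim

q, k, v = (torch.randn(B, H, S, D, device=device) for _ in range(3))

# "Carving" the attention matrix: mask_mod returns True where a
# (query, key) position pair is allowed to attend; here, plain causal.
def causal(b, h, q_idx, kv_idx):
    return q_idx >= kv_idx

# B=None / H=None broadcast the mask over batch and heads.
block_mask = create_block_mask(causal, B=None, H=None, Q_LEN=S, KV_LEN=S,
                               device=device)

out = flex_attention(q, k, v, block_mask=block_mask)  # (B, H, S, D)
```

The efficiency comes from the block mask: the kernel knows which tiles of the attention matrix are entirely masked and skips them rather than computing and discarding them, which makes patterns like causal, sliding-window, or document masking cheap.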
- Happy new year you all! 2025 is certainly full of promise.
- Happy Christmas you all!
- Whatever you say about the whole field of AI: it's not boring.
- Some tools that keep me sane on Mac: rectangleapp.com and karabiner-elements.pqrs.org
- It's Friday!
- Oh boy, GTA 6 has to be good. And Half-Life 3.
- This is really great.