François Fleuret
Research Scientist Meta/FAIR, Prof. University of Geneva, co-founder Neural Concept SA. I like reality.
https://fleuret.org
- I asked "on the other platform" what were the most important improvements to the original 2017 transformer. That was quite popular and here is a synthesis of the responses:
- - Prenorm: normalization in the residual blocks before the attention operation and the FFN respectively - GQA (Group Query Attention): more Q than (K, V)
- - RMSNorm instead of Layernorm: normalize only the scaling - MLA (Multi-head Latent Attention): stores a low-rank projection of the attention block input and compute the KV from it - SwiGLU: non-linearity for the FFN block with per-component gating
-
View full thread- Ring Attention: takes advantage of multi-node hardware to scale the computation according to the sequence length - Speculative decoding: a cheaper model generates tokens, and a rejection process corrects this generation to march the full-model distribution.
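A minimal PyTorch sketch of two of the items above, RMSNorm and SwiGLU, written from their standard definitions; the module names and dimensions are illustrative, not taken from any particular codebase.

```python
# Sketches of RMSNorm and SwiGLU from their standard definitions; names and
# dimensions are illustrative only.
import torch
from torch import nn


class RMSNorm(nn.Module):
    # Unlike LayerNorm, no mean subtraction: only the scale (root mean square)
    # is normalized, with a single learned gain per component.
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return self.weight * x * rms


class SwiGLU(nn.Module):
    # FFN block with per-component gating: silu(x W1) * (x W3), projected back by W2.
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x):
        return self.w2(torch.nn.functional.silu(self.w1(x)) * self.w3(x))
```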
- "You are in Paris, enjoy the city, stop obsessing with AI" Paris:
- Maybe the wall was the friends we made during that journey, Ted.
- I was the guest on the 7:30 pm news on @radiotelesuisse.bsky.social this evening, talking about Artificial Intelligence. www.rts.ch/play/tv/19h3...
- It is hard to overstate how cool and powerful flex attention is. @chhillee.bsky.social pytorch.org/blog/flexatten… TL;DR: it is an implementation of the attention operator in PyTorch that makes it possible, in particular, to efficiently "carve" the attention matrix. 1/3
- It does this by generating an optimized CUDA kernel on the fly. So it's cool for causal masks, but it also allows an amazing trick to deal with batches of sequences of various lengths *without padding*! 2/3
- To do so, you concatenate all the sequences to make a batch of a single sequence, and carve the attention matrix into a block-diagonal one (possibly with causal structure in each block) so that sequences cannot look at each other. Magic! 3/3
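A minimal sketch of the "no padding" trick described in this thread, assuming PyTorch >= 2.5 with `flex_attention` and `create_block_mask` from `torch.nn.attention.flex_attention`, and a CUDA device; the sequence lengths, shapes, and variable names are made up for illustration.

```python
# Sketch of the block-diagonal "document mask" trick, assuming PyTorch >= 2.5
# and a CUDA device. Lengths, shapes, and names are illustrative only.
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

device = "cuda"

# Three sequences of different lengths, concatenated into a single sequence
# (total chosen as a multiple of 128 to match the default block size).
lengths = torch.tensor([48, 80, 128])
total = int(lengths.sum())
# doc_id[i] is the index of the original sequence that token i belongs to.
doc_id = torch.repeat_interleave(torch.arange(len(lengths)), lengths).to(device)

def doc_causal(b, h, q_idx, kv_idx):
    # Block-diagonal carving: a query may only attend to keys of the same
    # original sequence, and only to past positions (causal within each block).
    return (doc_id[q_idx] == doc_id[kv_idx]) & (q_idx >= kv_idx)

block_mask = create_block_mask(doc_causal, B=None, H=None,
                               Q_LEN=total, KV_LEN=total, device=device)

# Dummy Q, K, V: batch 1, 4 heads, `total` tokens, head dimension 16.
q, k, v = (torch.randn(1, 4, total, 16, device=device) for _ in range(3))

# torch.compile(flex_attention) is what generates the optimized kernel;
# the uncompiled call below runs a reference implementation.
out = flex_attention(q, k, v, block_mask=block_mask)
```

The `doc_causal` mask keeps attention block-diagonal across the concatenated sequences, with a causal structure inside each block, which is the carving described in the thread.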
- Happy new year you all! 2025 is certainly full of promise.
- Happy Christmas you all!
- Whatever you say about the whole field of AI: It's not boring.
- Some tools that keep me sane on Mac: rectangleapp.com karabiner-elements.pqrs.org
- It's Friday!
- Oh boy, GTA 6 has to be good. And Half Life 3.
- This is very great.