Xenova
Bringing the power of machine learning to the web. Currently working on Transformers.js (@huggingface 🤗)
- As someone who learnt so much by watching @shiffman.lol's coding videos in high school, I never imagined that one day my own library would feature on his channel! 🥹 If you're interested in learning more about 🤗 Transformers.js, I highly recommend checking it out! 👉 www.youtube.com/watch?v=KR61...
- The next generation of AI-powered websites is going to be WILD! 🤯 In-browser tool calling & MCP are finally here, allowing LLMs to interact with websites programmatically. To show what's possible, I built a demo using Liquid AI's new LFM2 model, powered by 🤗 Transformers.js.
- As always, the demo is open source (which you can find under the "Files" tab), so I'm excited to see how the community builds upon this! 🚀 🔗 Link to demo: huggingface.co/spaces/Liqui...
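For reference, the core of a demo like this can be sketched in a few lines of Transformers.js. Everything model-specific below is an assumption (the ONNX repo id, the tool-call output format, and that the chat template accepts a `tools` option); the Space's source has the exact setup:

```js
import { AutoTokenizer, pipeline } from "@huggingface/transformers";

// Assumed ONNX export id; see the Space's "Files" tab for the real one.
const model_id = "onnx-community/LFM2-1.2B-ONNX";
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const generator = await pipeline("text-generation", model_id, { device: "webgpu" });

// Describe a page-level "tool" the model may call (JSON-schema style).
const tools = [{
  name: "set_background",
  description: "Change the page background colour",
  parameters: {
    type: "object",
    properties: { color: { type: "string", description: "Any CSS colour" } },
    required: ["color"],
  },
}];

// Render the chat template with the tool definitions inlined
// (recent Transformers.js versions forward extra fields like `tools` to the template).
const prompt = tokenizer.apply_chat_template(
  [{ role: "user", content: "Make this page dark mode, please." }],
  { tools, tokenize: false, add_generation_prompt: true },
);

const [out] = await generator(prompt, { max_new_tokens: 128, return_full_text: false });

// The reply contains a structured tool call (the exact format is model-specific);
// the page parses it and dispatches to the matching JavaScript function,
// e.g. document.body.style.background = args.color;
console.log(out.generated_text);
```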
- Introducing Voxtral WebGPU: State-of-the-art audio transcription directly in your browser! 🤯 🗣️ Transcribe videos, meetings, songs and more 🔐 Runs on-device, meaning no data is sent to a server 🌎 Multilingual (8 languages) 🤗 Completely free (forever) & open source
- That's right, we're running Mistral's new Voxtral-Mini-3B model 100% locally in-browser on WebGPU, powered by Transformers.js and ONNX Runtime Web! 🔥 Try it out yourself! 👇 huggingface.co/spaces/webml...
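A minimal sketch of what the transcription call could look like, assuming the export works with the standard automatic-speech-recognition pipeline (the repo id is a guess; the Space's source shows the actual setup):

```js
import { pipeline } from "@huggingface/transformers";

// Assumed ONNX export id; check the Space's files for the exact repository.
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "onnx-community/Voxtral-Mini-3B-2507-ONNX",
  { device: "webgpu" },
);

// Accepts a URL/File or a Float32Array of 16 kHz mono samples.
const { text } = await transcriber("https://example.com/meeting.wav");
console.log(text);
```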
- A community member trained a tiny Llama model (23M parameters) on 3 million high-quality @lichess.org games, then deployed it to run entirely in-browser with 🤗 Transformers.js! Super cool! 🔥 It has an estimated Elo rating of ~1400... can you beat it? 👀 (runs on both mobile and desktop)
- Model: huggingface.co/lazy-guy12/c... Online demo: lazy-guy.github.io/chess-llama/
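A rough sketch of how you might query such a model from Transformers.js, assuming an ONNX export is available (the repo id is hypothetical since the link above is truncated, and the prompt format is a guess; the model card documents the real ones):

```js
import { pipeline } from "@huggingface/transformers";

// Hypothetical repository id, for illustration only.
const generator = await pipeline("text-generation", "lazy-guy12/chess-llama");

// Trained on game transcripts, so a partial move list prompts a continuation.
const prompt = "1. e4 e5 2. Nf3 Nc6 3. Bb5 ";
const [out] = await generator(prompt, { max_new_tokens: 4, return_full_text: false });
console.log(out.generated_text.trim()); // the model's predicted next move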
- We did it! Kokoro TTS (v1.0) can now run 100% locally in your browser w/ WebGPU acceleration. Real-time text-to-speech without a server. ⚡️ Generate 10 seconds of speech in ~1 second for $0. What will you build? 🔥
- The most difficult part was getting the model running in the first place, but the next steps are simple: ✂️ Implement sentence splitting, enabling streamed responses 🌍 Multilingual support (only phonemization left) Who wants to help? 🤗 huggingface.co/spaces/webml...
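The sentence-splitting step is plain JavaScript; a minimal regex-based sketch of the idea (an approximation that ignores abbreviations etc.):

```js
// Split text into sentences so each one can be synthesized and played as soon
// as it is ready, instead of waiting for the whole passage.
function* splitSentences(text) {
  for (const match of text.matchAll(/[^.!?]+[.!?]+(?:\s+|$)/g)) {
    yield match[0].trim();
  }
}

for (const sentence of splitSentences("Hello there! How are you? I'm fine.")) {
  console.log(sentence); // -> "Hello there!" / "How are you?" / "I'm fine."
}
```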
- Introducing Kokoro.js, a new JavaScript library for running Kokoro TTS, an 82-million-parameter text-to-speech model, 100% locally in the browser w/ WASM. Powered by 🤗 Transformers.js. WebGPU support coming soon! 👉 npm i kokoro-js 👈 Link to demo (+ sample code) in 🧵
- You can get started in just a few lines of code! 🧑‍💻 Huge kudos to the Kokoro TTS community, especially taylorchu for the ONNX exports and Hexgrad for the amazing project! None of this would be possible without you all! 🤗 Try it out yourself: huggingface.co/spaces/webml...
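Based on the package README, getting started looks roughly like this (voice names and dtype options may differ across versions):

```js
import { KokoroTTS } from "kokoro-js";

const tts = await KokoroTTS.from_pretrained("onnx-community/Kokoro-82M-ONNX", {
  dtype: "q8", // quantized variant; see the next post on file sizes
});

const audio = await tts.generate("Life is like a box of chocolates.", {
  voice: "af_bella", // list available voices with tts.list_voices()
});
audio.save("audio.wav"); // in the browser, play or download the returned audio instead
```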
- The model is also extremely resilient to quantization. The smallest variant is only 86 MB in size (down from the original 326 MB), with no noticeable difference in audio quality! 🤯 Link to models/samples: huggingface.co/onnx-communi...
- Is this the future of AI browser agents? 👀 WebGPU-accelerated reasoning LLMs are now supported in Transformers.js! 🤯 Here's MiniThinky-v2 (1B) running 100% locally in the browser at ~60 tps (no API calls)! I can't wait to see what you build with it! Demo + source code in 🧵👇
- For the AI builders out there: imagine what could be achieved with a browser extension that (1) uses a powerful reasoning LLM, (2) runs 100% locally & privately, and (3) can directly access/manipulate the DOM! 👀 💻 Source code: github.com/huggingface/... 🔗 Online demo: huggingface.co/spaces/webml...
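Under the hood it's the usual Transformers.js text-generation pipeline; a sketch, where the model id and dtype are assumptions (the linked source has the exact values):

```js
import { pipeline, TextStreamer } from "@huggingface/transformers";

// Assumed ONNX export id, for illustration.
const generator = await pipeline(
  "text-generation",
  "onnx-community/MiniThinky-v2-1B-Llama-3.2-ONNX",
  { device: "webgpu", dtype: "q4f16" },
);

const messages = [{ role: "user", content: "What is 17 × 24? Think step by step." }];

// Stream tokens to the UI as they are generated.
const streamer = new TextStreamer(generator.tokenizer, { skip_prompt: true });
const output = await generator(messages, { max_new_tokens: 512, streamer });
console.log(output[0].generated_text.at(-1).content); // includes the reasoning trace
```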
- First project of 2025: Vision Transformer Explorer! I built a web app to interactively explore the self-attention maps produced by ViTs. This explains what the model is focusing on when making predictions, and provides insights into its inner workings! 🤯 Try it out yourself! 👇
- The app loads a small DINOv2 model into the user's browser and runs it locally using Transformers.js! 🤗 This means you can analyze your own images for free: simply click the image to open the file dialog. E.g., the model recognizes that long necks and fluffy ears are defining features of llamas! 🦙
- Vision Transformers work by dividing images into fixed-size patches (e.g., 14 × 14), flattening each patch into a vector and treating each as a token. It's fascinating to see what each attention head learns to "focus on". For example, layer 11, head 1 seems to identify eyes. Spooky! 👀
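To make the patch arithmetic concrete, assuming a 224 × 224 input (as in the smaller DINOv2 checkpoints):

```js
// A 224 × 224 image with 14 × 14 patches -> a 16 × 16 grid of patch tokens.
const imageSize = 224, patchSize = 14;
const perSide = imageSize / patchSize; // 16
const numPatches = perSide ** 2;       // 256
// Plus 1 [CLS] token (and 4 register tokens for DINOv2 w/ Registers),
// giving the sequence length the attention maps are computed over.
console.log(`${perSide} × ${perSide} grid -> ${numPatches} patch tokens`);
```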
- This project was greatly inspired by Brendan Bycroft's amazing LLM Visualization tool – check it out if you haven't already! Also, thanks to Niels Rogge for adding DINOv2 w/ Registers to transformers! 🤗 Source code: github.com/huggingface/... Online demo: huggingface.co/spaces/webml...
- Introducing Moonshine Web: real-time speech recognition running 100% locally in your browser! 🚀 Faster and more accurate than Whisper 🔒 Privacy-focused (no data leaves your device) ⚡️ WebGPU accelerated (w/ WASM fallback) 🔥 Powered by ONNX Runtime Web and Transformers.js Demo + source code below! 👇
- Huge shout-out to the Useful Sensors team for such an amazing model and to Wael Yasmina for his 3D audio visualizer tutorial! 🤗 💻 Source code: github.com/huggingface/... 🔗 Online demo: huggingface.co/spaces/webml...
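A minimal sketch of the transcription call (the repo id is assumed; the real-time microphone capture and resampling in the demo is where most of the work lives):

```js
import { pipeline } from "@huggingface/transformers";

// Assumed ONNX export id; the demo source lists the exact repository.
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "onnx-community/moonshine-tiny-ONNX",
  { device: "webgpu" }, // omit to use the WASM backend
);

// For real-time use, feed 16 kHz mono Float32Array chunks from the microphone.
const { text } = await transcriber("https://example.com/sample.wav");
console.log(text);
```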
- Reposted by Xenova 🤗: NEW PIECE: ‘Open-source’ is becoming a buzzword for many aspects of modern journalism, including open-source AI. But what is it, and how can journalists benefit from it? @marinaadami.bsky.social spoke to @fdaudens.bsky.social to find out. reutersinstitute.politics.ox.ac.uk/news/journal...