Adina Yakup
AI Research @Hugging Face 🤗
Contributing to the Chinese ML community.
- AI for science is moving fast🚀 Intern-S1-Pro 🔬 a MoE multimodal scientific reasoning model from Shanghai AI Lab, now live on @hf.co huggingface.co/internlm/Int... ✨ 1T total / 22B active ✨ Apache 2.0 ✨ SoTA scientific reasoning performance
- ✨ China’s open source AI ecosystem has entered a new phase. This final blog in the series examines how leading Chinese AI organizations are evolving, and what this implies for the future of open source. huggingface.co/blog/hugging...
- LongCat Image-Edit-Turbo is now live on @hf.co huggingface.co/meituan-long... It's the distilled version of LongCat-Image-Edit from the Meituan LongCat team, achieving a 10x speedup🚀
- GLM just entered the OCR field🔥 huggingface.co/zai-org/GLM-... ✨ 0.9B ✨ MIT licensed ✨ Multimodal GLM-V architecture ✨ #1 on OmniDocBench v1.5 (94.62)
- Step 3.5 Flash 🔥 a new foundation model from StepFun AI huggingface.co/collections/... ✨ Sparse MoE: 196B total / 11B active ✨ Supports up to 256K context ✨ Multi-token prediction for fast decoding (100–300 tok/s) ✨ Runs locally on consumer hardware
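For the "runs locally" part, here is a minimal sketch (my assumptions, not the official StepFun example) of pulling a Hub checkpoint with 4-bit quantization via transformers; the repo ID is a placeholder since the link above is truncated, and real memory needs depend on the released weights.

```python
# Hypothetical local-inference sketch: load a Hub checkpoint in 4-bit so it fits
# on consumer hardware. The repo ID is a placeholder; check the model card for the
# real name and for any custom inference code the release requires.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo_id = "stepfun-ai/Step-3.5-Flash"  # placeholder ID

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",           # spread layers across available GPU/CPU memory
    trust_remote_code=True,
)

prompt = "Explain multi-token prediction in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```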
- What a week 🤯 Following DeepSeek, Kimi, Qwen, Baidu, and Ant Group, Unitree Robotics has now released a VLA model on the hub too! huggingface.co/unitreerobot...
- Qwen3-ASR is out🚀 huggingface.co/collections/... ✨ 0.6B & 1.7B - Apache 2.0 ✨ 30 languages + 22 Chinese dialects, plus English accents across regions ✨ Single model for language ID + ASR (no extra pipeline stitching) ✨ Qwen3-ForcedAligner-0.6B, a strong forced aligner outperforming E2E baselines
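What "single model for language ID + ASR" could look like from the user side, sketched under assumptions: the repo ID below is a placeholder (the collection link is truncated), and the checkpoint may ship its own inference code instead of supporting the generic transformers pipeline.

```python
# Hypothetical usage sketch: transcribe audio with a Hub ASR checkpoint through the
# generic transformers speech-recognition pipeline. Placeholder repo ID.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Qwen/Qwen3-ASR-1.7B",  # placeholder, check the collection for the real repo
)

# Language identification and transcription happen inside the same model,
# so there is no separate language-ID stage stitched in front.
result = asr("sample.wav")
print(result["text"])
```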
- Baidu just released a new VLM: PaddleOCR-VL-1.5🔥 huggingface.co/PaddlePaddle... ✨ 0.9B - Apache 2.0 ✨ 94.5% OmniDocBench v1.5 ✨ Multilingual OCR: strong on rare characters & ancient texts
- Ant Group is on fire 🔥 After a VLA and a depth perception foundation model, here comes a new world model! huggingface.co/robbyant/lin... ✨ Minute-long rollouts at 16 FPS ✨ Structured camera and action control ✨ Apache 2.0
- LongCat-Flash-Lite🔥 a non-thinking MoE model released by the Meituan LongCat team huggingface.co/meituan-long... ✨ 68.5B total / 3B active - MIT license ✨ 256k context ✨ Faster inference with N-gram embeddings
- Ant Group is going big on robotics 🤖 They just dropped their first VLA and a depth perception foundation model on @hf.co
- Blog 2 is live 🔥 After the DeepSeek R1 moment, what came next wasn’t just more models. huggingface.co/blog/hugging... In this second post, we dive into the architectural and hardware choices shaping China’s open AI ecosystem.
- Big day in open source AI!! ✨ DeepSeek released OCR2 💥 huggingface.co/deepseek-ai/... ✨ Kimi K2.5 just landed 🔥 huggingface.co/moonshotai/K... With the Chinese Spring Festival 3 weeks away, what’s coming next?👀
- Kimi K2.5 from Moonshot AI is more than just another large model 🤯 huggingface.co/collections/... ✨ Native multimodality: image + video + language + agents 💥 ✨ 1T MoE / 32B active ✨ 256K context ✨ Modified MIT license ✨ Agent Swarm execution ✨ Open weights + open infra mindset
- Qwen just released Qwen3-TTS 🔊 huggingface.co/collections/... ✨ VoiceDesign, CustomVoice & Base: from custom voices to rapid voice cloning 🎙️ ✨ 0.6B & 1.7B - Apache 2.0 ✨ Supports 10 languages ✨ SOTA 12 Hz speech tokenizer for high compression & high-fidelity audio
- AgentCPM-report 🔥 a local DeepResearch agent released by OpenBMB huggingface.co/openbmb/Agen... ✨ 8B - Apache 2.0 ✨ Gemini-2.5-Pro level DeepResearch report generation ✨ Fully offline, privacy-first local deployment ✨ + GGUF version
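On the "fully offline" point, here is a minimal sketch of running a downloaded GGUF checkpoint locally with llama-cpp-python; the filename, context size, and prompt are assumptions rather than values from the model card.

```python
# Hypothetical offline-inference sketch using llama-cpp-python with a local GGUF
# file. The file name and settings are placeholders; download the GGUF from the
# model repo first and adjust to the released quantization.
from llama_cpp import Llama

llm = Llama(
    model_path="./AgentCPM-report-Q4_K_M.gguf",  # placeholder filename
    n_ctx=8192,        # context window; set to whatever the model supports
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Outline a short research report on MoE routing."}]
)
print(out["choices"][0]["message"]["content"])
```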
- DeepSeek R1 dropped one year ago 🐳 and a lot has changed. With Irene Solaiman, we’re launching a blog series on @hf.co about how that moment reshaped AI + open source in 2025, starting with strategic shifts and the explosion of new open models in China! huggingface.co/blog/hugging...
- Zhipu just released a powerful lightweight version of GLM 4.7 ✨ 30B total / 3B active MoE huggingface.co/zai-org/GLM-...
- Another Chinese model fully trained on domestic chips, released by China Telecom 👀 huggingface.co/Tele-AI/Tele... TeleChat3-36B-Thinking: ✨ Native support for the Ascend + MindSpore ecosystem ✨ Inspired by DeepSeek’s architecture design, bringing training stability and efficiency gains.
- After a VLM, StepFun dropped a new audio model: Step-Audio-R1.1, enabling thinking while speaking 🔥 huggingface.co/stepfun-ai/S... ✨ Apache 2.0 ✨ Combines a dual-brain architecture and acoustic-grounded reasoning to enable real-time dialogue with SOTA-level reasoning
- EvoCUA 🔥 a multimodal agent that can actually operate a computer, released by Meituan huggingface.co/collections/... ✨ 8B & 32B - Apache 2.0 ✨ Multi-turn control of Chrome, Excel, PowerPoint, VSCode ✨ 32B reaches 56.7% on OSWorld
- From ChatGPT Healthcare to Claude for healthcare, AI in medicine is speeding up🚀 Now BaichuanAI joins with Baichuan-M3 🏥 an open medical LLM trained for clinical decision-making huggingface.co/collections/... ✨ 235B - Apache 2.0 ✨ Lower hallucinations via Fact-Aware RL ✨ Built for long medical chats
- DeepSeek’s new work: Engram 🔥 Beyond MoE, it adds lookup-style conditional memory to LLMs. Paper: github.com/deepseek-ai/... Can’t wait to see what’s coming next 👀
- AgentCPM-Explore🔥 an on-device agent foundation model released by OpenBMB huggingface.co/openbmb/Agen... ✨ 4B - Apache 2.0 ✨ Supports 100+ multi-turn environment interactions with search + verification ✨ Full training/inference stack is openly shared as well
- Based on the 2025 Chinese AI Timeline, here are some interesting takeaways: huggingface.co/spaces/zh-ai...
- Skywork has been quiet for the past few months, but now they're back with Unipic 3 🔥 Consistency Model + Distribution Matching Distillation for fast, high-quality image editing, efficient to train and run. huggingface.co/collections/... ✨ Both models are under the MIT license
- WeChat AI is shipping! WeDLM 🔥 A new language model that generates tokens in parallel, making it faster than standard LLMs, with the same Transformer setup! ✨ 7B/8B - Base & Instruct ✨ Apache 2.0 huggingface.co/collections/...
- Qwen just released two new model series: Qwen3-VL-Embedding & Qwen3-VL-Reranker 🚀 ✨ 2B / 8B - Apache 2.0 ✨ 30+ languages ✨ Supports text, images, screenshots, videos, and arbitrary multimodal combinations huggingface.co/collections/... huggingface.co/collections/...
- DeepSeek updated the R1 paper from 22 to 86 pages and added voice input to the app! What's next? 👀 huggingface.co/papers/2501....
- Chinese open source AI in December 2025 was about the stack coming together: open, end-to-end, and ready to ship 🔥 huggingface.co/collections/... Like everyone else, I was OOO at the end of December, so feel free to share (in the comments or via a PR) any releases I missed in this list!
- Youtu-LLM 🔥 a 2B model from Tencent with 128K context and strong agentic abilities. huggingface.co/tencent/Yout...
- MiniMax M2.1 blog is out🔥 huggingface.co/blog/MiniMax...
- GLM-4.7 is live on @huggingface🔥 A new, open, MIT-licensed agentic model from Zhipu! huggingface.co/collections/...
- Qwen-Image-Layered 🖼️ an open model that decomposes images into explicit RGBA layers ✨ Apache 2.0 huggingface.co/Qwen/Qwen-Im...
- Following up on LLaDA 2.0, the paper is now out on Daily Papers🔥 It has sparked a lot of discussion in the community for showing how discrete diffusion LLMs can scale to 100B and run faster than traditional AR models. huggingface.co/papers/2512....
- New model from Meituan 🚀 LongCat-Video-Avatar🔥 Audio-driven character animation with text, image, and video inputs, all in one! huggingface.co/meituan-long...
- MiMo-V2-Flash 🔥 a new MoE model from Xiaomi huggingface.co/XiaomiMiMo/M...
- New work from MiniMax: Visual Tokenizer Pre-training (VTP) 🔥 a framework that helps visual tokenizers learn better representations for diffusion-based image generation. huggingface.co/collections/...