Yuan Tang
Senior Principal Software Engineer at Red Hat AI | Open Source Leader at KServe, Argo, Kubeflow, Kubernetes, CNCF | Maintainer of XGBoost, TensorFlow | Keynote Speaker | Author | Technical Advisor
More info: http://terrytangyuan.xyz
We are hiring!
- View my verified achievement from @linuxfoundation. www.credly.com/badges/9814c... via @credly
- View my verified achievement from @linuxfoundation. www.credly.com/badges/6bdde... via @credly
- Excited to share that I'll be speaking at #KubeCon Europe in Amsterdam! You can find me in the following sessions:
  1. Cloud Native AI + Kubeflow Day: Welcome + Opening Remarks: sched.co/2DZN3
  2. Project Lightning Talk: Evolving KServe: sched.co/2EFyW
  3. Advancing Kubernetes AI Conformance: Current State and Roadmap: sched.co/2EF76
  4. Evolving KServe: The Unified Model Inference Platform for Both Predictive and Generative AI: sched.co/2EF54
  5. Explore TAG Workloads Foundation: Advancing Cloud Native Execution From Core Runtime To Applications: sched.co/2EF7F
  A full list of my upcoming and past sessions can be found here: github.com/terrytangyuan/publi…
- If you see me in the hallway or at the sessions, I’d love to chat about:
  - Model inference (KServe, vLLM, @llm-d.ai)
  - @kubernetes.io AI Conformance Program
  - @kubefloworg.bsky.social & @argoproj.bsky.social
  - @cncf.io TAG Workloads Foundation
  - Open source, cloud-native, AI infra and systems
- 📢 𝗧𝗵𝗲 𝗦𝘁𝗮𝘁𝗲 𝗼𝗳 𝗠𝗼𝗱𝗲𝗹 𝗦𝗲𝗿𝘃𝗶𝗻𝗴 𝗖𝗼𝗺𝗺𝘂𝗻𝗶𝘁𝗶𝗲𝘀: 𝗝𝗮𝗻𝘂𝗮𝗿𝘆 𝗘𝗱𝗶𝘁𝗶𝗼𝗻 𝗶𝘀 𝗼𝘂𝘁! We launched our newsletter publicly last year to share our Red Hat AI teams’ contributions to upstream communities. We’ve gained over 𝟭𝟮𝟬𝟬 𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗯𝗲𝗿𝘀!
- Our goal with this newsletter is to give a clear, community-driven view of what’s happening across the model serving ecosystem, including updates from vLLM, KServe, llm-d, @kubernetes.io, and Llama Stack.
- 👉 Check out the January newsletter here: inferenceops.substack.com/p/state-of-the-mode… 👉 Subscribe to get future issues in your inbox: inferenceops.substack.com 🚀 Thanks to everyone who subscribed so far! Kudos to all contributors to this edition!
- Nir Rozenbaum, Sasa Zelenovic, Pete Cheslock, Wentao Ye, Yuan Tang
- 🚀 𝗠𝗼𝗿𝗲 𝗽𝗼𝘀𝗶𝘁𝗶𝗼𝗻𝘀 𝗮𝗿𝗲 𝗼𝗽𝗲𝗻 𝗮𝘁 𝗥𝗲𝗱 𝗛𝗮𝘁 𝗔𝗜! Our 𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 team continues to grow with ML engineer, researcher, and developer advocate positions!
- We’re looking for passionate candidates to help us push the boundaries of 𝗔𝗜/𝗟𝗟𝗠 𝗶𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 and contribute directly to open source projects like vLLM and llm-d. 📩 Please check out these job postings and 𝗲𝗺𝗮𝗶𝗹 𝗺𝗲 (address in my profile) with a short summary of your background + your resume.
- linkedin.com/in/terrytangyuan/ 1. Principal Machine Learning Engineer, AI Inference: redhat.wd5.myworkdayjobs.com/jobs/job/Boston/Pri…
- 🎉 𝗠𝗶𝗹𝗲𝘀𝘁𝗼𝗻𝗲 𝘂𝗻𝗹𝗼𝗰𝗸𝗲𝗱! The 𝘐𝘯𝘧𝘦𝘳𝘦𝘯𝘤𝘦𝘖𝘱𝘴: 𝘚𝘵𝘢𝘵𝘦 𝘰𝘧 𝘵𝘩𝘦 𝘔𝘰𝘥𝘦𝘭 𝘚𝘦𝘳𝘷𝘪𝘯𝘨 𝘊𝘰𝘮𝘮𝘶𝘯𝘪𝘵𝘪𝘦𝘴 newsletter from Red Hat AI just reached 𝟭,𝟬𝟬𝟬 𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗯𝗲𝗿𝘀 in only 5 months!
- A huge thank-you to everyone who’s subscribed, shared, and supported this effort. If you’re curious about what our teams are building across the model serving open source ecosystem, including updates from vLLM, llm-d, KServe, and more, check it out! 🔗 inferenceops.substack.com
- 🚀 𝗪𝗲’𝗿𝗲 𝗛𝗶𝗿𝗶𝗻𝗴 𝗦𝗼𝗳𝘁𝘄𝗮𝗿𝗲 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝘀 𝗶𝗻 𝟮𝟬𝟮𝟲! Our team at 𝗥𝗲𝗱 𝗛𝗮𝘁 𝗔𝗜 continues to grow, and we’re looking for passionate engineers (at multiple levels) to help us push the boundaries of 𝗔𝗜 𝗶𝗻𝗳𝗿𝗮𝘀𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲.
- We’re especially excited to meet folks with experience in: Golang, Rust, C++, Python, Kubernetes, distributed systems, and 𝗢𝗽𝗲𝗻 𝗦𝗼𝘂𝗿𝗰𝗲. If building next-generation distributed AI systems excites you, we’d love to hear from you!
- 📩 Please 𝗲𝗺𝗮𝗶𝗹 𝗺𝗲 (address in my profile) with a short summary of your background + your resume. linkedin.com/in/terrytangyuan/ 👉 Not looking right now? Know someone who’d be a great fit?
- 🎉 Feeling thankful today and reflecting on two incredible years at Red Hat! While my official two-year anniversary is just a few days away, Thanksgiving feels like the perfect time to express my gratitude for this journey. Time flies! Looking forward to many more years to come!
- See my full post on LinkedIn: www.linkedin.com/posts/terryt...
- View my verified achievement from @linuxfoundation.org. www.credly.com/badges/e3fc4... via @credly
- 𝗦𝗹𝗶𝗱𝗲𝘀 𝗳𝗼𝗿 𝗼𝘂𝗿 𝗖𝗹𝗼𝘂𝗱 𝗡𝗮𝘁𝗶𝘃𝗲 𝗔𝗜 𝗗𝗮𝘆 𝘀𝗲𝘀𝘀𝗶𝗼𝗻 𝗳𝗿𝗼𝗺 #𝗞𝘂𝗯𝗲𝗖𝗼𝗻! I paired up with Dan Sun, co-founder of KServe, on a session titled 𝘒𝘚𝘦𝘳𝘷𝘦 𝘕𝘦𝘹𝘵: 𝘈𝘥𝘷𝘢𝘯𝘤𝘪𝘯𝘨 𝘎𝘦𝘯𝘦𝘳𝘢𝘵𝘪𝘷𝘦 𝘈𝘐 𝘔𝘰𝘥𝘦𝘭 𝘚𝘦𝘳𝘷𝘪𝘯𝘨.
- It was great to talk about:
  🔹 Our journey to @cncf.io and the wide adoption of the project
  🔹 Challenges of GenAI inference and the KServe features that help address them, including:
    🔹 Metrics-based autoscaling via KEDA
    🔹 Rate limiting based on token usage via Envoy AI Gateway
    🔹 The new llm-d integration, which provides an intelligent scheduler, P/D disaggregated serving, and prefix caching
  (A toy sketch of token-based rate limiting follows this post.)
  Links:
  🔹 Slides: github.com/terrytangyuan/publi…
  🔹 CNCF announcement: cncf.io/blog/2025/11/11/kse…
  🔹 Red Hat announcement: redhat.com/en/blog/kserve-join…
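For readers curious how rate limiting based on token usage differs from plain request counting, here is a minimal, self-contained Python sketch of a token-bucket limiter keyed on LLM token consumption. It is a toy illustration under assumed names (`TokenBucket`, `allow`), not how Envoy AI Gateway or KServe actually implement it.

```python
import time

class TokenBucket:
    """Toy token-usage rate limiter: each caller may consume at most `rate`
    LLM tokens per second, with bursts up to `capacity` tokens.
    Hypothetical sketch; not Envoy AI Gateway's actual implementation."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate              # tokens replenished per second
        self.capacity = capacity      # maximum burst size, in tokens
        self.available = capacity     # tokens currently available
        self.last = time.monotonic()  # time of the last budget update

    def allow(self, requested_tokens: int) -> bool:
        """Return True if the request's token count fits in the budget."""
        now = time.monotonic()
        # Replenish the budget for elapsed time, capped at capacity.
        self.available = min(self.capacity,
                             self.available + (now - self.last) * self.rate)
        self.last = now
        if requested_tokens <= self.available:
            self.available -= requested_tokens
            return True
        return False

# Usage: a budget of 1000 tokens/s per caller, with bursts up to 4000 tokens.
limiter = TokenBucket(rate=1000, capacity=4000)
prompt_tokens = 750  # e.g., estimated from the request's prompt length
if limiter.allow(prompt_tokens):
    print("forward request to the model server")
else:
    print("reject with HTTP 429: token budget exhausted")
```

Limiting on tokens rather than requests matters for LLMs because a single long-context request can cost orders of magnitude more compute than a short one.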
- Finally got my KubeCon professional headshot refreshed! 📸 I’ve been using the one I took three years ago at #KubeCon. It was definitely time for an update.
- Huge thanks to the CNCF Events team for offering this opportunity every year and to the amazing photographers who capture so many great moments from the conference.
- 🎤 𝗠𝗼𝗿𝗲 𝘀𝗹𝗶𝗱𝗲𝘀 𝗮𝗿𝗲 𝗮𝘃𝗮𝗶𝗹𝗮𝗯𝗹𝗲! At #KubeCon North America 2025 in Atlanta, I had the pleasure of joining Stephen Rust, Rajas Kakodkar, and Alex Scammon for our session: 𝘐𝘯𝘵𝘳𝘰𝘥𝘶𝘤𝘪𝘯𝘨 𝘛𝘈𝘎 𝘞𝘰𝘳𝘬𝘭𝘰𝘢𝘥𝘴 𝘍𝘰𝘶𝘯𝘥𝘢𝘵𝘪𝘰𝘯: 𝘈𝘥𝘷𝘢𝘯𝘤𝘪𝘯𝘨 𝘵𝘩𝘦 𝘊𝘰𝘳𝘦 𝘰𝘧 𝘊𝘭𝘰𝘶𝘥 𝘕𝘢𝘵𝘪𝘷𝘦 𝘌𝘹𝘦𝘤𝘶𝘵𝘪𝘰𝘯
- The @cncf.io 𝗧𝗲𝗰𝗵𝗻𝗶𝗰𝗮𝗹 𝗔𝗱𝘃𝗶𝘀𝗼𝗿𝘆 𝗚𝗿𝗼𝘂𝗽𝘀 (TAGs) play a vital role in shaping the future of cloud native. We’re excited to introduce a new addition: the 𝗧𝗔𝗚 𝗪𝗼𝗿𝗸𝗹𝗼𝗮𝗱𝘀 𝗙𝗼𝘂𝗻𝗱𝗮𝘁𝗶𝗼𝗻.
- We presented the mission, scope, and early initiatives of TAG Workloads Foundation, focused on defining and advancing practices and standards for cloud native workload execution environments and lifecycle management.
- Join us to help shape the next phase of cloud native maturity, from fundamental runtime environments to future-forward workload patterns.
- Many of us from 𝗥𝗲𝗱 𝗛𝗮𝘁 𝗔𝗜 attended #KubeCon North America in Atlanta last week. We’d like to share some highlights from the Red Hat AI model serving perspective, including llm-d, KServe, vLLM, and @kubernetes.io. inferenceops.substack.com/p/kubecon-no...
- Hope you enjoy this special issue of the 𝘚𝘵𝘢𝘵𝘦 𝘰𝘧 𝘵𝘩𝘦 𝘔𝘰𝘥𝘦𝘭 𝘚𝘦𝘳𝘷𝘪𝘯𝘨 𝘊𝘰𝘮𝘮𝘶𝘯𝘪𝘵𝘪𝘦𝘴 newsletter! Help us 𝗿𝗲𝗮𝗰𝗵 𝟭,𝟬𝟬𝟬 𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗯𝗲𝗿𝘀!
- 🎤 𝗦𝗹𝗶𝗱𝗲𝘀 𝗻𝗼𝘄 𝗮𝘃𝗮𝗶𝗹𝗮𝗯𝗹𝗲! At #KubeCon North America 2025 in Atlanta, I had the pleasure of joining @ritazh.bsky.social, Jiaxin Shan, and Sergey Kanzhelev for our session: 𝘕𝘢𝘷𝘪𝘨𝘢𝘵𝘪𝘯𝘨 𝘵𝘩𝘦 𝘙𝘢𝘱𝘪𝘥 𝘌𝘷𝘰𝘭𝘶𝘵𝘪𝘰𝘯 𝘰𝘧 𝘓𝘢𝘳𝘨𝘦 𝘔𝘰𝘥𝘦𝘭 𝘐𝘯𝘧𝘦𝘳𝘦𝘯𝘤𝘦: 𝘞𝘩𝘦𝘳𝘦 𝘋𝘰𝘦𝘴 𝘒𝘶𝘣𝘦𝘳𝘯𝘦𝘵𝘦𝘴 𝘍𝘪𝘵?
- We covered:
  1. The evolving landscape of AI-related Working Groups in the @kubernetes.io community.
  2. Emerging challenges from fast-moving inference architectures: heterogeneous hardware, distributed KV cache management, and fault tolerance and recovery strategies for large-scale model inference.
  3. Key initiatives from K8s WG Serving and relevant discussions in WG AI Conformance.
  4. Solution showcases and reference architectures, including LWS, DRA, AIBrix, llm-d, KServe, Dynamo, and KAITO.
  (A toy sketch of prefix-cache-aware routing follows below.)
  📑 Slides (link to recording will also be available soon): github.com/terrytangyua...
- The discussion highlighted how the Kubernetes ecosystem is quickly adapting to support 𝗻𝗲𝘅𝘁-𝗴𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 𝗔𝗜 𝗮𝗻𝗱 𝗟𝗟𝗠 𝘄𝗼𝗿𝗸𝗹𝗼𝗮𝗱𝘀. This is only the beginning!
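To ground the distributed KV cache and prefix caching ideas from the session recap above, here is a minimal Python sketch of prefix-cache-aware routing: requests that share a prompt prefix are hashed to the same replica, so previously computed KV-cache entries are more likely to be reused. This is a toy under arbitrary assumptions (hypothetical replica names, a fixed 32-character prefix key); real schedulers such as llm-d's also weigh replica load and cache state reported by the engines.

```python
import hashlib

REPLICAS = ["replica-0", "replica-1", "replica-2"]  # hypothetical pod names
PREFIX_CHARS = 32  # assumption: key routing on the prompt's first 32 chars

def route(prompt: str) -> str:
    """Pick a replica by hashing the prompt's leading characters, so requests
    that share a prefix (e.g., the same system prompt) land on the same
    replica and can reuse its KV cache. Toy sketch only."""
    prefix = prompt[:PREFIX_CHARS]
    digest = hashlib.sha256(prefix.encode("utf-8")).digest()
    return REPLICAS[int.from_bytes(digest[:4], "big") % len(REPLICAS)]

# Requests sharing a long system prompt route to the same replica:
system = "You are a helpful assistant for customer support tickets. "
print(route(system + "Summarize ticket #1234."))         # same replica...
print(route(system + "Draft a reply to ticket #5678."))  # ...as this one
```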