Marzieh Fadaee
seeks to understand language.
Head of Cohere Labs
@Cohere_Labs @Cohere
PhD from @UvA_Amsterdam
marziehf.github.io
- Reposted by Marzieh Fadaee: We’re not your average lab. We’re a hybrid research environment dedicated to revolutionizing the ML space. And we’re hiring a Senior Research Scientist to co-create with us. If you believe in research as a shared, global effort — this is your chance.
- I'm excited to share that I'll be stepping into the role of Head of @cohereforai.bsky.social. It's an honor and a responsibility to lead such an extraordinary group of researchers pushing the boundaries of AI research.
- Reposted by Marzieh Fadaee: While effective for chess♟️, Elo ratings struggle with LLM evaluation due to volatility and transitivity issues. New post in collaboration with AI Singapore explores why Elo falls short for AI leaderboards and how we can do better.
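A minimal sketch of the volatility issue mentioned above (not from the linked post; the player names, starting ratings, and K-factor below are illustrative assumptions): with standard Elo updates, the same multiset of pairwise outcomes can produce different final ratings depending on the order in which the battles are processed.

```python
# Illustrative Elo sketch: identical head-to-head records, different
# final ratings depending on match order (order-dependence/volatility).

def expected(r_a, r_b):
    """Expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, score_a, k=32):
    """One Elo update; score_a is 1 if A wins, 0 if A loses."""
    e_a = expected(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1 - score_a) - (1 - e_a))

def run(results):
    """Process (winner, loser) pairs sequentially from equal starting ratings."""
    ratings = {"A": 1000.0, "B": 1000.0}
    for winner, loser in results:
        ratings[winner], ratings[loser] = update(ratings[winner], ratings[loser], 1)
    return ratings

# Same outcomes (A beats B twice, B beats A twice), two different orders:
print(run([("A", "B"), ("A", "B"), ("B", "A"), ("B", "A")]))
print(run([("A", "B"), ("B", "A"), ("A", "B"), ("B", "A")]))
```

The two printouts differ even though both players won exactly half their games, which is one reason sequentially updated Elo leaderboards can be unstable for model comparisons.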
- Breaking into AI research is harder than ever, and early-career researchers face fewer chances to get started. Entry points matter. We started the Scholars Program 3 years ago to give new researchers a real shot — excited to open applications for year 4✨
- Over the years, I've watched scholars go from their very first project → to their first paper → to research careers they once thought were out of reach. It’s been incredible to see what can happen when someone gets their first real chance and works hard to make it count 🏅
- ACL day 2 ✨
- 🖼️ Most text-to-image models only really work in English. This limits who can use them and whose imagination they reflect. We asked: can we build a small, efficient model that understands prompts in multiple languages natively?
- Everyone talks about GEB (I agree, it's a gem) but Hofstadter's Analogy book is criminally underrated. If you're working on learning intelligence through language understanding, it’s a must-read.
- Reposted by Marzieh Fadaee: 🍋 Squeezing the most out of a few samples - check out our LLMonade recipe for few-sample test-time scaling in multitask environments. Turns out that standard methods miss out on gains on non-English languages. We propose more robust alternatives. Very proud of this work that our scholar Ammar led! 🚀
- London has me under its spell. every. single. visit.
- Reposted by Marzieh Fadaee: 🚨LLM safety research needs to be at least as multilingual as our models. What's the current state, and how do we progress from here? This work led by @yongzx.bsky.social has answers! 👇
- Reposted by Marzieh Fadaee: Over 7000 languages are spoken worldwide 🌐, but AI safety efforts focus on only a fraction of them. Our latest paper draws on our multi-year efforts with the wider research community to explore why this matters and how we can bridge the AI language gap.
- Reposted by Marzieh Fadaee: 📢 The Copenhagen NLP Symposium on June 20th! - Invited talks by @loubnabnl.hf.co (HF), @mziizm.bsky.social (Cohere), @najoung.bsky.social (BU), @kylelo.bsky.social (AI2), and Yohei Oseki (UTokyo) - Exciting posters by other participants. Register to attend and/or present your poster at cphnlp.github.io /1
- 1/ Science is only as strong as the benchmarks it relies on. So how fair—and scientifically rigorous—is today’s most widely used evaluation benchmark? We took a deep dive into Chatbot Arena to find out. 🧵
- 2/ 🧪 With theory, simulations, and real-world experiments, we stress-test Arena’s fairness and find: - Undisclosed private model testing warps results - Silent model deprecation undermines rank stability - Data access disparities between providers enable overfitting
- Not in Singapore for #ICLR2025 but our lab’s work is! In particular, I am very proud of these collaborations: ✨INCLUDE (spotlight) — models fail to grasp regional nuances across languages 💎To Code or Not to Code (poster) — code is key for generalizing beyond coding tasks
- 🚨 Excited to share our latest paper! Multilingual LLMs are getting really good. But the way we evaluate them? Not always the best. 🌟 We show how decades of lessons from Machine Translation can help us fix it.
- 📖 New preprint with Eleftheria Briakou @swetaagrawal.bsky.social @mziizm.bsky.social @kocmitom.bsky.social! arxiv.org/abs/2504.11829 🌍 It reflects experiences from my personal research journey: coming from MT into multilingual LLM research, I missed reliable evaluations and evaluation research…
- Very excited to release Kaleidoscope—a multilingual, multimodal evaluation set for VLMs, built as part of our open-science initiative! 🌍 18 languages (high-, mid-, and low-resource) 📚 21k questions (55% require image understanding) 🧪 STEM, social science, reasoning, and practical skills
- Reposted by Marzieh Fadaee: Big news from WMT! 🎉 We are expanding beyond MT and launching a new multilingual instruction shared task. Our goal is to foster truly multilingual LLM evaluation and best practices in automatic and human evaluation. Join us and build the winning multilingual system! www2.statmt.org/wmt25/multil...
- Reposted by Marzieh Fadaee: ☀️ Summer internship at Cohere! Are you excited about multilingual evaluation, human judgment, or meta-eval? Come help us explore what a rigorous eval really looks like while questioning the status quo in LLM evaluation. I’m looking for an intern (EU timezone preferred), are you interested? Ping me!
- Command🅰️ technical report is out. Information-dense. Detailed. Pretty. Simply A+! 💎: cohere.com/research/pap...
- I'm excited to share the tech report for our @cohere.com @cohereforai.bsky.social Command A and Command R7B models. We highlight our novel approach to model training including self-refinement algorithms and model merging techniques at scale. Read more below! ⬇️
- I am so proud of Cohere's dedication to open science and its impact on the community! ✨
- Good morning Paris
- ✨👓 Aya Vision is here 👓✨ A multilingual, multimodal model designed to understand across languages and modalities (text, images, etc.) to bridge the language gap and empower global users!