Workshop on Multilingual Data Quality Signals
The first iteration of our workshop will be co-located with @colmweb.org 2025 in Montreal.
wmdqs.org
- WMDQS is underway! Come join us in Room 520A at @colmweb.org! #COLM2025
- We started with a keynote from @juliakreutzer.bsky.social about multilingual fine-tuning data!
- Reposted by Workshop on Multilingual Data Quality Signals: Looking forward to tomorrow's #COLM2025 workshop on multilingual data quality! 🤩
- In collaboration with @commoncrawl.bsky.social, MLCommons, and @eleutherai.bsky.social, the first edition of WMDQS at @colmweb.org starts tomorrow in Room 520A! We have an updated schedule on our website, including a list of all accepted papers.
- Our first keynote will be from @juliakreutzer.bsky.social about data for multilingual fine-tuning.
- Reposted by Workshop on Multilingual Data Quality Signals: If you want to help us improve language and cultural coverage, and build an open source LangID system, please register for our shared task on Language Identification! 💬 Registering is easy! All the details are on the shared task webpage: wmdqs.org/shared-task/ Deadline: July 23, 2025 (AoE) ⏰
- The Common Crawl Foundation, MLCommons, EleutherAI, and Johns Hopkins' Center for Language and Speech Processing have the pleasure of inviting you to register for the 1st shared task on Language Identification for web data. commoncrawl.org/blog/wmdqs-s...
- We've added lots more documents/languages and extended the deadline for the first round of annotations until July 23rd. Check out the details below 👇