Workshop on Multilingual Data Quality Signals
The first iteration of our workshop will be co-located with @colmweb.org 2025 in Montreal.
wmdqs.org
- WMDQS is underway! Come join us in Room 520A at @colmweb.org! #COLM2025
- We started with a keynote from @juliakreutzer.bsky.social about multilingual fine-tuning data!
- We presented the results of our shared task! We received annotations for over 30,000 documents representing over 60 languages. We also showed the results of our LangID dataset and system shared task tracks. Thank you to everyone who participated!
- If you were able to join us, let us know about your experience: docs.google.com/forms/d/e/1F...
- Reposted by Workshop on Multilingual Data Quality Signals: Looking forward to tomorrow's #COLM2025 workshop on multilingual data quality! 🤩
- In collaboration with @commoncrawl.bsky.social, MLCommons, and @eleutherai.bsky.social, the first edition of WMDQS at @colmweb.org starts tomorrow in Room 520A! We have an updated schedule on our website, including a list of all accepted papers.
- Our first keynote will be from @juliakreutzer.bsky.social about data for multilingual fine-tuning.
- Our second keynote will be by David Adelani about text quality for low-resource languages.
- See our updated website for more details: wmdqs.org
- Reposted by Workshop on Multilingual Data Quality Signals: The Common Crawl Foundation, MLCommons, EleutherAI, and Johns Hopkins' Center for Language and Speech Processing have the pleasure of inviting you to register for the 1st shared task on Language Identification for web data. commoncrawl.org/blog/wmdqs-s...
- We've added lots more documents/languages and extended the deadline for the first round of annotations until July 23rd. Check out the details below 👇
- For context: bsky.app/profile/cath...
- Contribute annotations here: dynabench.org/tasks/text-l...