As AI continues to shape the way we interact with the digital world, the demand for inclusive and diverse language technologies is greater than ever. The Multilingual Speech Language Model Challenge (MLC-SLM), hosted by Nexdata and organized as part of the ICML 2025 Competition Track, addresses this need by encouraging the development of advanced automatic speech recognition (ASR) systems across a wide range of languages and dialects.
This global competition invites researchers, developers, and AI practitioners to build and fine-tune large-scale multilingual speech language models capable of accurately transcribing speech in multiple underrepresented languages. With more than 70 hours of transcribed training data provided per language and over 300 hours of publicly available test data, the challenge creates a unique opportunity to explore the frontiers of multilingual model training, transfer learning, and bias mitigation.
Participants will compete in two tracks: Full Dataset and Low-Resource, pushing the limits of performance in both high-data and constrained environments. This structure reflects real-world challenges where access to linguistic data can vary significantly between languages. The aim is to develop models that are not only accurate and robust but also scalable and equitable.
Beyond the competition itself, MLC-SLM is a call to action for the research community to build technologies that support linguistic diversity, digital inclusion, and cultural preservation. By participating in this challenge, teams will contribute to a broader mission of democratizing AI for all languages, paving the way for more inclusive global communication systems.