In this ELOQUENCE Webinar & WebCafé session, the discussion focused on one of the key challenges for the future of European AI: how to develop multilingual speech technologies that are accurate, trustworthy, inclusive, and useful across different languages and real-world contexts.
The session featured two speakers: Alessio Brutti, Scientific Coordinator of the ELOQUENCE project and Head of the Speech Technology Lab at Fondazione Bruno Kessler, and Martina Valente, Speech Technology Engineer at Almawave, working in the company’s Voice Engineering Unit. Together, they presented both the research and industry perspectives on multilingual AI and its role in supporting digital inclusion.
Alessio opened the session by introducing FBK’s work within ELOQUENCE, particularly in the area of multilingual and bias-controlled language models. He highlighted that multilinguality has become an increasingly important topic in the scientific community. While language technologies have traditionally focused on high-resource languages such as English, Chinese, French, and Spanish, many European languages still have far less data available. This creates a performance gap and limits access to high-quality speech technologies for many communities.
This is especially relevant for Europe, where linguistic and cultural diversity is one of the region’s defining strengths. However, from a technological point of view, this diversity also presents a challenge. Many European languages, including minority and regional languages, are considered low-resource compared to globally dominant languages. Alessio emphasised that research institutions and publicly funded projects have an important role to play in addressing this gap, especially for languages and communities that may not be a priority for large commercial players.
Within ELOQUENCE, FBK is exploring different approaches to improve speech recognition for low-resource languages. These include transfer learning methods, speech large language model frameworks, and the use of unlabelled data through weak supervision. Such approaches aim to reduce the amount of manually labelled data required and make multilingual speech recognition more scalable and effective.
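To make the idea of weak supervision more concrete, one widely used variant is pseudo-labelling: an existing recogniser transcribes unlabelled audio, and only the hypotheses it is confident about are reused as training targets. The session did not describe FBK's actual pipeline at this level of detail, so the following is only a minimal sketch under stated assumptions. The `PseudoLabel` type, the `select_pseudo_labels` function, the 0.9 threshold, and the sample utterances are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PseudoLabel:
    """A machine-generated transcript for an unlabelled utterance (hypothetical type)."""
    utterance_id: str
    transcript: str    # hypothesis produced by an existing ASR model
    confidence: float  # model confidence, assumed to lie in [0, 1]


def select_pseudo_labels(candidates, min_confidence=0.9):
    """Keep non-empty hypotheses whose confidence meets the threshold.

    The threshold value is an illustrative assumption, not a recommendation.
    """
    return [
        c for c in candidates
        if c.confidence >= min_confidence and c.transcript.strip()
    ]


# Made-up example data: one confident hypothesis, one empty, one uncertain.
candidates = [
    PseudoLabel("utt-001", "buongiorno a tutti", 0.97),
    PseudoLabel("utt-002", "", 0.99),
    PseudoLabel("utt-003", "grazie mille", 0.62),
]

selected = select_pseudo_labels(candidates)
selected_ids = [c.utterance_id for c in selected]
print(selected_ids)
```

In a setup like this, the retained utterances would be added to the training set alongside the manually labelled data, which is how such methods aim to reduce the amount of manual transcription needed.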
Martina then brought the industry perspective, presenting Almawave’s work on real-world multilingual speech AI applications. She introduced Almawave’s voice technologies, which cover automatic speech recognition, language identification, speaker-related technologies, and speech analytics across more than 40 languages. She also presented concrete case studies, including the automatic transcription and translation of Tanzanian court trials and multilingual broadcast monitoring for Scandinavian radio channels.
These examples showed how multilingual speech AI can support transparency, accessibility, and communication in practical settings. In the Tanzanian justice system, for instance, speech technologies helped address a multilingual setting in which court proceedings involve both English and Swahili. In broadcast monitoring, the challenge was to detect and transcribe the different languages that appear within the same channel or programme.
The session also addressed trustworthiness, data quality, and ethical concerns. Both speakers stressed that multilingual AI depends heavily on responsible data collection, cleaning, filtering, and processing. Bias, privacy, cultural context, and regulatory compliance are central challenges that must be considered from the beginning.
Ultimately, the WebCafé showed that multilingual speech AI is not only a technical challenge, but also a matter of inclusion. By supporting more languages and cultural contexts, projects like ELOQUENCE help ensure that language technologies serve a broader and more diverse range of users.
