AI Magazine October 2025 | Page 217

AI APPLICATIONS
Jensen Huang, CEO, Nvidia

Nvidia’s Granary is an open-source speech dataset packed with one million hours of multilingual audio spanning 25 European languages.

It’s specifically targeting those languages that usually get left behind – such as Croatian, Estonian and Maltese – where data scarcity has been a real headache for virtual assistant developers and speech AI applications.
Nearly 650,000 hours are earmarked for automatic speech recognition and over 350,000 hours for translation tasks.
Working with Carnegie Mellon University and Italy’s Fondazione Bruno Kessler, Nvidia uses its NeMo toolkit to transform unlabelled audio into proper training data without the usual tedious manual annotation process.
Nvidia has also released two accompanying models. Canary-1b-v2 packs a billion parameters for accuracy-focused work, while Parakeet-tdt-0.6b-v3 uses 600 million parameters but prioritises speed for real-time applications – perfect for virtual assistant interactions. Both are currently topping Hugging Face’s leaderboards.
Perhaps most impressively, research shows Granary needs roughly half the training data of existing datasets to hit the same accuracy levels.
As Nvidia CEO Jensen Huang says: “General-purpose, open-source research and foundation models are the backbone of AI innovation.”
Granary is targeting production-scale applications including multilingual virtual assistants and customer service voice agents – a big step towards addressing the fact that fewer than 100 of the world’s roughly 7,000 languages get decent AI support.