Two Undergrads Build Dia AI Speech Model to Rival Commercial Giants
24-Apr-2025
TechCrunch reported that Korean startup Nari Labs has released Dia, an open-source text-to-speech AI model positioned as a challenger to top-tier offerings such as ElevenLabs and Sesame. Dia was developed by just two undergraduate students with zero funding, which has drawn attention across the tech world. The 1.6-billion-parameter model supports emotional intonation, nonverbal cues (e.g., laughter, coughs), and tags for multiple speakers. The project was inspired by Google's NotebookLM and used Google's TPU Research Cloud for computing resources. Nari Labs' benchmarks and side-by-side demos show Dia surpassing ElevenLabs Studio and Sesame CSM-1B in expressiveness and timing accuracy. The company plans to build a consumer-facing app that lets content creators remix and generate personalized speech content. The founder shared the announcement on X, and the underdog story has gained traction on the strength of its ambition and the model's performance.