Two Undergrads Unveil SOTA Open-Source Speech AI
Published: May 2025
Image source: Nari Labs

The Rundown

Korean startup Nari Labs, founded by two undergraduates with no outside funding, has released Dia, a 1.6-billion-parameter open-source text-to-speech model that rivals leading systems such as ElevenLabs and Sesame CSM-1B.

👉 Try Dia on GitHub: Nari Labs Dia Model

Key Features of Dia

- Expressive Emotional Tones: Delivers nuanced speech that conveys joy, sadness, and urgency.
- Multi-Speaker Support: Tag voices for distinct characters or personas.
- Nonverbal Cues: Includes realistic laughter, coughing, whispers, and screams.
- Open-Source & Free: No licensing fees, which makes it ideal for startups, researchers, and hobbyists.

Performance Benchmarks

In side-by-side tests, Dia outperformed:

- ElevenLabs Studio in waveform naturalness and timing
- Sesame CSM-1B in latency and throughput in large-batch generation

How Two Undergrads Did It

TPU Research Cloud: L...