SMRTR AI• Jun 15, 2026• Hacker News

Beyond Transcription: ASR Model Delivers Words, Emotion, and Intent in 200ms

SMRTR summary

Whissle's META-1 model goes beyond standard speech recognition: in a single pass of roughly 200 ms, it returns a transcript together with emotion, intent, age, and gender labels, about 9x faster than competing metadata solutions. In tests on 1,300 samples across four languages, adding a KenLM language model reduced word error rates by up to 3.6%, while AssemblyAI failed completely on non-English streaming, exposing a major gap in multilingual ASR reliability.

SMRTR provides this summary for quick context. The original article belongs to Hacker News.

Related Stories

How To Improve ML Models with Human Labels

AssemblyAI has improved its automatic language detection model, now supporting 17 languages and outperforming major competitors on 16 of them. The company's speech-to-text...

AI• Hacker News• Nov 10, 2025

Omnilingual ASR: Advancing automatic speech recognition for 1600 languages

Meta's Omnilingual ASR recognizes speech in over 1,600 languages using a 7-billion parameter model, achieving under 10% error rates for 78% of languages. Communities can add new...

AI• Wired• Feb 4, 2026

Mistral's New Ultra-Fast Translation Model Gives Big AI Labs a Run for Their Money

Mistral AI launched two ultra-fast speech-to-text models that can translate between 13 languages, with one operating in near real-time within 200 milliseconds. At just 4 billion...

AI• MIT Technology Review• Jan 15, 2025

Meta’s new AI model can translate speech from more than 100 languages

Meta's new AI model, SeamlessM4T, can translate speech between 101 languages with improved accuracy and efficiency. The model uses parallel data mining to associate sounds with...

AI• Daily.dev• Feb 26, 2025

ElevenLabs is launching its own speech-to-text model

ElevenLabs, a $3.3 billion AI startup, launched Scribe, its first stand-alone speech-to-text model supporting over 99 languages. The model outperforms competitors in benchmark...

AI• Daily.dev• Feb 12, 2026

Improved Gemini audio models for powerful voice interactions

Google upgraded Gemini 2.5 Flash Native Audio with better voice interactions and 90% instruction adherence. They also launched live speech translation supporting over 70 languages...
