SesameAILabs/csm: A Conversational Speech Generation Model
SMRTR summary
Sesame has released CSM-1B, a speech generation model that generates audio from text and audio inputs using a Llama backbone and an audio decoder. The model can produce a variety of voices but has not been fine-tuned on any specific voice. A demo is available on Hugging Face for testing its audio generation capabilities.
SMRTR provides this summary for quick context. The original article belongs to Daily.dev.