News

With a focus on expressive quality, reproducibility, and open access, Dia adds a distinctive new voice to the landscape of text-to-speech.
Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.
ElevenLabs had developed the speech-to-text component for its AI conversational agent platform, which was released last year.
OpenAI Whisper is an automatic speech recognition (ASR) system. It’s designed to convert spoken language into text. Whisper was trained on a diverse range of internet audio, which includes ...
AI text-to-speech programs could “unlearn” how to imitate certain people. New research shows models can be directly edited to hide selected voices, even when users specifically ask for them.
Text-to-speech with feeling - this new AI model does everything but shed a tear. ElevenLabs' 'most expressive' v3 model can speak with a huge range of emotions in more than 70 languages.
Researchers at Amazon have trained the largest text-to-speech model yet, which they claim exhibits “emergent” qualities improving its ability to speak even complex sentences naturally ...
Researchers study the importance of enunciation when using speech-to-text software in medical situations. Speech-to-text programs are becoming more popular for everyday tasks like hands-free ...