Abstract: Whisper is a powerful automatic speech recognition (ASR) model. Nevertheless, its zero-shot performance on low-resource speech requires further improvement. Child speech, as a representative ...
Abstract: Automatic Speech Recognition (ASR) systems have improved and eased how humans interact with devices. ASR system converts an acoustic waveform into the relevant text form. Modern ASR ...
A simple yet powerful Laravel package for integrating Microsoft Edge Text-to-Speech (TTS) into your applications. It features audio streaming, caching, abstraction, and security controls. This package ...
This repo provides a command-line tool for performing automatic speech-to-text tasks (i.e., "transcription") using open source models from Hugging Face Hub. For interactive tasks, it allows users to ...