Abstract: Despite recent advancements in speech processing, zero-resource speech translation (ST) and automatic speech recognition (ASR) remain challenging problems. In this work, we propose to ...
Abstract: Speech Emotion Recognition (SER) technology analyzes speech characteristics in human-computer interactions to understand user intent and improve interaction experience. It is widely used in ...
AudioFingerprint is a production-ready, local audio fingerprinting and song identification system inspired by Shazam and Google Sound Search. It uses spectral peak extraction and combinatorial hashing ...
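The two techniques named above, spectral peak extraction and combinatorial hashing, can be sketched roughly as follows. This is a minimal illustration of the general Shazam-style idea, not AudioFingerprint's actual code: the function names `find_peaks` and `hash_peaks`, the parameters `threshold`, `fan_out`, and `max_dt`, and the choice of a truncated SHA-1 digest are all assumptions made for the example. The input is assumed to be a precomputed magnitude spectrogram given as a list of time frames, each a list of frequency-bin values.

```python
import hashlib
from typing import List, Tuple

Peak = Tuple[int, int]  # (time_frame, frequency_bin)


def find_peaks(spec: List[List[float]], threshold: float = 0.0) -> List[Peak]:
    """Return cells that exceed `threshold` and are strictly greater
    than every in-bounds neighbour in the surrounding 3x3 window."""
    peaks: List[Peak] = []
    n_t = len(spec)
    n_f = len(spec[0]) if n_t else 0
    for t in range(n_t):
        for f in range(n_f):
            v = spec[t][f]
            if v <= threshold:
                continue
            is_peak = True
            for dt in (-1, 0, 1):
                for df in (-1, 0, 1):
                    if dt == 0 and df == 0:
                        continue
                    tt, ff = t + dt, f + df
                    if 0 <= tt < n_t and 0 <= ff < n_f and spec[tt][ff] >= v:
                        is_peak = False
            if is_peak:
                peaks.append((t, f))
    # Loop order guarantees peaks are sorted by time, then frequency.
    return peaks


def hash_peaks(peaks: List[Peak], fan_out: int = 3, max_dt: int = 10) -> List[Tuple[str, int]]:
    """Pair each anchor peak with up to `fan_out` later peaks that lie
    within `max_dt` frames, and hash (f1, f2, time delta).
    Returns (hash, anchor_time) tuples."""
    out: List[Tuple[str, int]] = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt <= 0:
                continue  # skip peaks in the same frame as the anchor
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are farther away
            key = f"{f1}|{f2}|{dt}".encode()
            # Assumption: a truncated SHA-1 hex digest serves as the fingerprint key.
            out.append((hashlib.sha1(key).hexdigest()[:16], t1))
            paired += 1
            if paired >= fan_out:
                break
    return out
```

Because each hash depends only on the two frequency bins and the time difference between them, the same pair of peaks produces the same hash wherever it occurs in a recording; the stored anchor time is what would let a matcher estimate the offset.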