Abstract: This paper addresses the x-to-audio alignment challenge (XACLE) of the ICASSP 2026 Signal Processing Grand Challenges, which aims to develop automatic audio-text relevance assessment methods ...
Abstract: Text-based person retrieval is a cross-modal task that seeks to match pedestrian images with their corresponding textual descriptions. A key challenge in this task arises from the inherent ...