Abstract: Contemporary advancements in Earth observation technologies have generated substantial data resources for remote sensing image retrieval applications. However, existing models exhibit ...
Abstract: Recent CLIP-guided 3D generation methods have achieved promising results but struggle to generate faithful 3D shapes that conform to the input text due to the gap between text and image ...
TL;DR: We propose ReAlign, a plug-and-play reward-guided alignment strategy for text-to-motion generation, which explicitly enhances both semantic consistency and motion realism throughout the ...
BrgSA is an open-source framework designed for zero-shot 3D medical image diagnosis and cross-modal retrieval. It introduces Semantic Summarization and Cross-Modal Knowledge Interaction (CMKI) to ...