Abstract: Image-text matching is a core task in image processing and AI. It focuses on computing the similarity between a natural-language sentence and an image to create a unified space for ...