Multimodal retrieval (récupération multimodale)
under construction
Definition
XXXXX
See also text-to-image generation
French
XXXXXX
English
Cross-Modal Retrieval
CMR
Cross-Modal Retrieval (CMR) is the task of retrieving items in one modality (image, text, video, or audio) given a query in another, for example finding images that match a text description. The core challenge of CMR is the heterogeneity gap: data from different modalities have distinct representations, so they cannot be compared directly. To address this, most CMR methods learn a shared latent embedding space. Items from every modality are projected into this space, where their similarity can be measured with a distance or similarity metric such as cosine similarity.
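
The shared-space idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the projection matrices `W_TEXT` and `W_IMAGE` are hand-picked rather than learned, the feature vectors are toy values, and the helper names (`project`, `cosine_similarity`, `retrieve`) are hypothetical. A real CMR system would learn the projections, typically with neural encoders trained on paired data.

```python
import math

def project(features, weights):
    """Linear projection: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-picked projections (learned in a real system).
# Text features are 3-d, image features are 4-d; both map to a 2-d shared space.
W_TEXT = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]
W_IMAGE = [[0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

def retrieve(query_text, image_features, top_k=2):
    """Rank images by similarity to a text query in the shared space."""
    q = project(query_text, W_TEXT)
    scored = [
        (name, cosine_similarity(q, project(feats, W_IMAGE)))
        for name, feats in image_features.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy image features; raw dimensions differ from the text query on purpose.
IMAGES = {
    "img_a": [0.2, 0.9, 0.5, 0.1],
    "img_b": [0.7, 0.1, 0.3, 0.9],
    "img_c": [0.0, 0.5, 0.0, 0.5],
}

query = [1.0, 0.2, 0.4]
results = retrieve(query, IMAGES)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The raw text and image vectors have different lengths and cannot be compared directly, which is the heterogeneity gap in miniature. Only after projection into the common 2-d space does a single metric apply to both.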