A Unified Framework for Multimodal Retrieval


In this paper, a unified framework for multimodal content retrieval is presented. The proposed framework supports retrieval of rich media objects as unified sets of different modalities (image, audio, 3D, video and text), by efficiently combining all monomodal heterogeneous similarities to a global one according to an automatic weighting scheme. Then, a multimodal space is constructed, to capture the semantic correlations among multiple modalities. In contrast to existing techniques, the proposed method is also able to handle external multimodal queries, by embedding them to the already constructed multimodal space, following a space mapping procedure of a submanifold analysis. In our experiments with five real multimodal datasets, we show the superiority of the proposed approach against competitive methods.

  • D. Rafailidis, S. Manolopoulou, P. Daras, "A Unified Framework for Multimodal Retrieval", ELSEVIER, Pattern Recognition, Volume 46, Issue 12, Pages 3358–3370, December 2013

  • Full document available here.
    Contact Information

    Dr. Petros Daras, Research Director
    6th km Charilaou – Thermi Rd, 57001, Thessaloniki, Greece
    P.O.Box: 60361
    Tel.: +30 2310 464160 (ext. 156)
    Fax: +30 2310 464164
    Email: daras(at)iti(dot)gr