Graph-based Multimodal Fusion with Metric Learning for Multimodal Classification


In this paper, a graph-based, supervised classification method for multimodal data is introduced. It can be applied on data of any type consisting of any number of modalities and can also be used for the classification of datasets with missing modalities. The proposed method maps the features extracted from every modality to a space where the intrinsic structure of the multimodal data is kept. In order to map the extracted features of the different modalities into the same space and, at the same time, maintain the feature distances between similar and dissimilar modality data instances, a metric learning method is used. The proposed method has been evaluated on NUS-Wide, NTU-RGBD and AV-Letters multimodal datasets and has shown competitive results with the state-of-the-art methods in the field, while is able to cope with datasets with missing modalities.

  • M. Angelou, V. Solachidis, N. Vretos, P. Daras, "Graph-based Multimodal Fusion with Metric Learning for Multimodal Classification", Pattern Recognition, 95, 296-307, 2019. DOI:

  • Full document available here.
    Contact Information

    Dr. Petros Daras, Research Director
    6th km Charilaou – Thermi Rd, 57001, Thessaloniki, Greece
    P.O.Box: 60361
    Tel.: +30 2310 464160 (ext. 156)
    Fax: +30 2310 464164
    Email: daras(at)iti(dot)gr