A Cross-Modal Variational Framework for Food Image Analysis

Authors
T. Theodoridis
V. Solachidis
K. Dimitropoulos
P. Daras
Year
2020
Venue
International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, October 25-28, 2020.

Abstract

Food analysis lies at the core of modern nutrition recommender systems, providing the foundation for a high-level understanding of users' eating habits. This paper focuses on the sub-task of ingredient recognition from food images using a variational framework. The framework consists of two variational encoder-decoder branches, each processing information from a different modality (images and text), and a variational mapper branch, which aligns the distributions of the individual branches. Experimental results on the Yummly-28K dataset show that the proposed framework performs better than similar variational frameworks, and results on the large-scale Recipe1M dataset show that it surpasses current state-of-the-art approaches.
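
The abstract does not state how the mapper branch aligns the two distributions. As a purely illustrative sketch, one common way to measure the mismatch between two latent distributions is the KL divergence between diagonal Gaussians, shown below. The use of KL, the diagonal-Gaussian latents, the function name `kl_diag_gaussians`, and all numeric values are assumptions made for this example and are not taken from the paper.

```python
import math


def kl_diag_gaussians(mu_p, logvar_p, mu_q, logvar_q):
    """KL(p || q) for two diagonal Gaussians given means and log-variances.

    Hypothetical helper: the alignment objective used in the paper is not
    given in the abstract, so this is only one plausible choice.
    """
    total = 0.0
    for mp, lp, mq, lq in zip(mu_p, logvar_p, mu_q, logvar_q):
        var_p = math.exp(lp)
        var_q = math.exp(lq)
        # Per-dimension closed form:
        # 0.5 * (log(var_q / var_p) + (var_p + (mp - mq)^2) / var_q - 1)
        total += 0.5 * (lq - lp + (var_p + (mp - mq) ** 2) / var_q - 1.0)
    return total


# Made-up latent parameters standing in for the image and text branches.
image_mu, image_logvar = [0.2, -0.1], [0.0, 0.0]
text_mu, text_logvar = [0.0, 0.0], [0.0, 0.0]

# A smaller value means the image-branch latent is closer to the text-branch latent.
alignment_penalty = kl_diag_gaussians(image_mu, image_logvar, text_mu, text_logvar)
print(alignment_penalty)
```

Under these assumptions, such a term would be minimized during training so that latent codes produced from an image resemble those produced from the matching ingredient text.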