Advances in Facial Expression Recognition: A Survey of Methods, Benchmarks, Models, and Datasets

T. Kopalidis
V. Solachidis
N. Vretos
P. Daras
Information 2024, 15, 135


Recent technological developments have enabled computers to identify and categorize facial expressions to determine a person’s emotional state in an image or a video. This process, called “Facial Expression Recognition (FER)”, has become one of the most popular research areas in computer vision. In recent times, deep FER systems have primarily concentrated on addressing two significant challenges: the problem of overfitting due to limited training data availability, and the presence of expression-unrelated variations, including illumination, head pose, image resolution, and identity bias. In this paper, a comprehensive survey is provided on deep FER, encompassing algorithms and datasets that offer insights into these intrinsic problems. Initially, this paper presents a detailed timeline showing how the methods and datasets used in deep FER have evolved. Then, a comprehensive review of FER methods is introduced, covering the basic principles of FER (components such as preprocessing, feature extraction, and classification) from the pre-deep-learning era (traditional methods built on handcrafted features and classical classifiers, e.g., HOG and SVM) to the deep learning era. Moreover, a brief introduction is provided to the benchmark datasets used to evaluate different FER methods, which fall into two categories, controlled environments (lab) and uncontrolled environments (in the wild), together with a comparison of different FER models. Existing deep neural networks and related training strategies designed for FER, based on static images and dynamic image sequences, are discussed. The remaining challenges and corresponding opportunities in FER and the future directions for designing robust deep FER systems are also pinpointed.