Multi-modal Variational Faster R-CNN for Improved Visual Object Detection in Manufacturing

P. Mouzenidis
A. Louros
D. Konstantinidis
K. Dimitropoulos
P. Daras
T. Mastos
In IEEE/CVF International Conference on Computer Vision Workshops, October 11-17, 2021.


Visual object detection is a critical task for a variety of industrial applications, such as robot navigation, quality control and product assembly. Modern industrial environments require AI-based object detection methods that achieve high accuracy, robustness and generalization. To this end, we propose a novel object detection approach that processes and fuses information from RGB-D images for the accurate detection of industrial objects. The proposed approach utilizes a novel Variational Faster R-CNN algorithm that aims to improve the robustness and generalization ability of the original Faster R-CNN algorithm by employing a variational autoencoder (VAE) encoder-decoder network and an attention layer. Experimental results on two object detection datasets, namely the well-known RGB-D Washington dataset and the QCONPASS dataset of industrial objects, which is first presented in this paper, verify the significant performance improvement achieved by the proposed approach.