Multi-modal Variational Faster R-CNN for Improved Visual Object Detection in Manufacturing

Authors	P. Mouzenidis
	A. Louros
	D. Konstantinidis
	K. Dimitropoulos
	P. Daras
	T. Mastos
Year	2021
Venue	IEEE/CVF International Conference on Computer Vision Workshops, October 11-17, 2021.

Abstract

Visual object detection is a critical task for a variety of industrial applications, such as robot navigation, quality control and product assembly. Modern industrial environments require AI-based object detection methods that achieve high accuracy, robustness and generalization. To this end, we propose a novel object detection approach that processes and fuses information from RGB-D images for the accurate detection of industrial objects. The proposed approach utilizes a novel Variational Faster R-CNN algorithm that aims to improve the robustness and generalization ability of the original Faster R-CNN algorithm by employing a VAE encoder-decoder network and an attention layer. Experimental results on two object detection datasets, namely the well-known RGB-D Washington dataset and the QCONPASS dataset of industrial objects, which is introduced in this paper, verify the significant performance improvement achieved when the proposed approach is employed.