Multi-modal Variational Faster R-CNN for Improved Visual Object Detection in Manufacturing


Visual object detection is a critical task for a variety of industrial applications, such as robot navigation, quality control and product assembling. Modern industrial environments require AI-based object detection methods that can achieve high accuracy, robustness and generalization. To this end, we propose a novel object detection approach that can process and fuse information from RGB-D images for the accurate detection of industrial objects. The proposed approach utilizes a novel Variational Faster R-CNN algorithm that aims to improve the robustness and generalization ability of the original Faster R-CNN algorithm by employing a VAE encoder-decoder network and a very powerful attention layer. Experimental results on two object detection datasets, namely the well-known RGB-D Washington dataset and the QCONPASS dataset of industrial objects that is first presented in this paper, verify the significant performance improvement achieved when the proposed approach is employed.

  • P. Mouzenidis, A. Louros, D. Konstantinidis, K. Dimitropoulos, P Daras, T. Mastos, "Multi-modal Variational Faster R-CNN for Improved Visual Object Detection in Manufacturing", in IEEE/CVF International Conference on Computer Vision Workshops, October 11-17, 2021.

  • Full document available here.
    Contact Information

    Dr. Petros Daras, Research Director
    6th km Charilaou – Thermi Rd, 57001, Thessaloniki, Greece
    P.O.Box: 60361
    Tel.: +30 2310 464160 (ext. 156)
    Fax: +30 2310 464164
    Email: daras(at)iti(dot)gr