A fast, modular scene understanding system using context-aware object detection

We propose a semantic scene understanding system that is suitable for real robotic operations. The system solves different tasks (semantic segmentation and object detections) in an opportunistic and distributed fashion but still allows communication between modules to improve their respective performances.

We propose the use of the semantic space to improve specific out-of-the-box objectdetectors and an update model to take the evidence from different detection into account in the semantic segmentation process. Our proposal is evaluated with the KITTI dataset, on the objectdetection benchmark and on five different sequences manually annotated for the semantic segmentation task, demonstrating the efficacy of our approach.