Electronic International Standard Serial Number (EISSN)
1424-8220
abstract
Growing on-board processing capabilities have led to more complex sensor configurations, enabling autonomous car prototypes to expand their operational scope. Nowadays, the joint use of LiDAR data and multiple cameras is almost standard and poses new challenges for existing multi-modal perception pipelines, such as dealing with contradictory or redundant detections caused by inference on overlapping images. In this paper, we address this last issue in the context of sequential schemes like F-PointNets, where object candidates are obtained in the image space and the final 3D bounding box is then inferred from point cloud information. To this end, we propose including a re-identification branch in the 2D detector, i.e., Faster R-CNN, so that objects seen from adjacent cameras can be handled before the 3D box estimation takes place, removing duplicates and completing the object’s point cloud. Extensive experimental evaluations covering both the 2D and 3D domains affirm the effectiveness of the proposed methodology. The findings indicate that our approach outperforms conventional Non-Maximum Suppression (NMS) methods. In particular, we observed a significant gain of over 5% in accuracy for cars in camera overlap regions. These results highlight the potential of our upgraded detection and re-identification system in practical scenarios for autonomous driving.
Classification
subjects
Computer Science
Electronics
Industrial Engineering
Mechanical Engineering
Robotics and Industrial Informatics
Telecommunications
keywords
3D object detection; multi-camera setup; Siamese network; non-maxima suppression