Over the last decades, the idea of smart cities has evolved from a visionary concept into a concrete reality. However, this vision has not yet been fully realized, partly because of the challenges facing contemporary data collection systems. Meanwhile, advances in deep learning and computer vision have produced highly accurate detection algorithms capable of obtaining 3D data from images. This work has predominantly centered on data extraction from a vehicle's perspective, overlooking the advantages of infrastructure-mounted cameras for 3D pose estimation of vehicles in urban environments. This paper focuses on 3D pose estimation from this alternative perspective. Infrastructure-based cameras offer an enhanced field of view, avoid occlusions, and provide more information about the objects' sizes, leading to better results and more accurate predictions than models trained on a vehicle's viewpoint. This research therefore proposes a new path for exploration, supporting the integration of monocular infrastructure-based data collection systems into smart city development.
Classification
subjects
Robotics and Industrial Informatics
keywords
monocular 3D object detection; smart cities; intelligent infrastructures; deep learning; computer vision