DAttNet: monocular depth estimation network based on attention mechanisms

publication date

  • December 2023

start page

  • 3347

end page

  • 3356

volume

  • 36

International Standard Serial Number (ISSN)

  • 0941-0643

Electronic International Standard Serial Number (EISSN)

  • 1433-3058

abstract

  • As autonomous vehicles get closer to our daily lives, the need for architectures that function as redundant pipelines is becoming increasingly critical. To address this issue without compromising the budget, researchers aim to avoid duplicating high-cost sensors such as LiDARs. In this work, we propose using monocular cameras, which are already essential for some modules of the autonomous platform, for 3D scene understanding. While many methods for depth estimation from single images have been proposed in the literature, they usually rely on complex neural network ensembles that extract dense feature maps, resulting in a high computational cost. Instead, we propose a novel and inherently efficient method for obtaining depth images that replaces tangled neural architectures with attention mechanisms applied to basic encoder-decoder models. We evaluate our method on the KITTI public dataset and in real-world experiments on our automated vehicle. The results demonstrate the viability of our approach, which can compete with intricate state-of-the-art methods while outperforming most alternatives based on attention mechanisms.
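
The abstract refers to attention mechanisms applied to feature maps in an encoder-decoder model but does not describe the specific layers used. As a rough illustration only, the following is a minimal sketch of generic scaled dot-product self-attention over a small set of feature vectors. It is not the DAttNet architecture: the function names `softmax` and `self_attention` are hypothetical, and the learned query, key, and value projections of a real attention layer are assumed here to be the identity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Generic scaled dot-product self-attention (illustrative only).

    `features` is a list of N feature vectors, each of dimension d,
    e.g. the flattened spatial positions of an encoder feature map.
    Assumption: query, key, and value projections are the identity;
    a trained network would learn these projections.
    Returns (outputs, weights), where weights[i][j] is how strongly
    position i attends to position j.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for query in features:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        w = softmax(scores)
        # Each output is an attention-weighted average of all feature vectors.
        out = [sum(w_j * value[c] for w_j, value in zip(w, features))
               for c in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights
```

Each output vector is a convex combination of the inputs, so every position can draw on context from the whole feature map without adding further convolutional branches.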

subjects

  • Robotics and Industrial Informatics

keywords

  • depth estimation; deep learning; attention layers; autonomous driving