Network-level aircraft trajectory planning via multi-agent deep reinforcement learning: Balancing climate considerations and operational manageability

publication date

  • May 2025

start page

  • 1

end page

  • 16

article number

  • 126604

volume

  • 271

International Standard Serial Number (ISSN)

  • 0957-4174

Electronic International Standard Serial Number (EISSN)

  • 1873-6793

abstract

  • Optimizing flight trajectories emerges as a viable strategy to mitigate the non-CO2 climate impacts of aviation. However, integrating individually optimized trajectories into the air traffic management system poses operational challenges, notably in terms of traffic safety and complexity. This paper presents a novel cooperative decision-making framework employing multi-agent deep reinforcement learning to plan operationally feasible climate-friendly routes from the perspective of the air traffic management system. The proposed strategy leverages the twin delayed deep deterministic policy gradient (TD3) algorithm to adjust flight trajectories during the planning phase and resolve the potential conflicts associated with climate-optimal trajectories. To address the scalability issue inherent in multi-agent environments, we derive a single policy applicable to an arbitrary number of concurrently operating aircraft. To handle the non-stationarity of the environment, fully observable critic networks are employed, providing comprehensive situational awareness for each agent during training. The effectiveness of the proposed approach is validated by comparing it against three algorithms and evaluating the derived policy across multiple sets of climate-optimal trajectories over European airspace. The results demonstrate that our framework can effectively mitigate aviation's climate impact while maintaining operational feasibility. When the decision space is restricted to speed changes only, up to 80% climate impact reduction is achievable while potential conflicts decrease by 10% compared with standard business-as-usual trajectories. Notably, without the proposed method, obtaining a similar level of climate impact mitigation leads to a substantial increase in the number of conflicts. Enhancing the proposed framework by incorporating additional decision variables, such as lateral path and altitude adjustments, as well as other ATM performance indicators relevant to the flight planning phase, can further facilitate the practical implementation of climate-friendly trajectories.
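
    To make the architecture described in the abstract more concrete, the sketch below illustrates, under stated assumptions, the two ingredients it names: a single actor shared by all aircraft agents (so the policy scales to an arbitrary number of aircraft) and fully observable, TD3-style twin critics that see the joint observations and actions of all agents during training. All names, layer sizes, observation/action dimensions, and the bounded speed-change action are illustrative assumptions, not the authors' implementation.

    ```python
    # Minimal sketch: parameter-shared actor + fully observable twin critics,
    # in the spirit of multi-agent TD3 with centralized training.
    # Dimensions, layer sizes, and names are assumptions for illustration only.
    import torch
    import torch.nn as nn

    OBS_DIM = 8    # assumed per-aircraft observation size (e.g., position, speed, intent)
    ACT_DIM = 1    # assumed action: a bounded speed adjustment
    N_AGENTS = 4   # example number of concurrently operating aircraft

    class SharedActor(nn.Module):
        """One policy shared by every aircraft agent (scales with the number of agents)."""
        def __init__(self, obs_dim: int, act_dim: int):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, 128), nn.ReLU(),
                nn.Linear(128, 128), nn.ReLU(),
                nn.Linear(128, act_dim), nn.Tanh(),  # action in [-1, 1], scaled to a speed change
            )

        def forward(self, obs: torch.Tensor) -> torch.Tensor:
            return self.net(obs)

    class CentralTwinCritic(nn.Module):
        """TD3-style twin critics that are fully observable during training:
        each Q-network conditions on the joint observations and joint actions."""
        def __init__(self, obs_dim: int, act_dim: int, n_agents: int):
            super().__init__()
            joint_dim = n_agents * (obs_dim + act_dim)
            def q_net():
                return nn.Sequential(
                    nn.Linear(joint_dim, 256), nn.ReLU(),
                    nn.Linear(256, 256), nn.ReLU(),
                    nn.Linear(256, 1),
                )
            self.q1, self.q2 = q_net(), q_net()

        def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor):
            x = torch.cat([joint_obs, joint_act], dim=-1)
            return self.q1(x), self.q2(x)

    if __name__ == "__main__":
        actor = SharedActor(OBS_DIM, ACT_DIM)
        critic = CentralTwinCritic(OBS_DIM, ACT_DIM, N_AGENTS)

        obs = torch.randn(1, N_AGENTS, OBS_DIM)          # batch of joint observations
        acts = actor(obs)                                 # shared policy applied per agent
        q1, q2 = critic(obs.flatten(1), acts.flatten(1))  # centralized value estimates
        print(acts.shape, q1.shape, q2.shape)             # -> (1, 4, 1) (1, 1) (1, 1)
    ```

    At execution time each aircraft would query only the shared actor with its own observation, while the centralized twin critics are used solely during training, which is the usual way such methods cope with the non-stationarity introduced by simultaneously learning agents.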

subjects

  • Aeronautics

keywords

  • air traffic management system; aircraft trajectory planning; aviation climate impact; conflict resolution; multi-agent deep reinforcement learning; twin delayed deep deterministic policy gradient algorithm