Simulation-based evaluation of model-free reinforcement learning algorithms for quadcopter attitude control and trajectory tracking
Overview
published in
- Neurocomputing Journal
publication date
- August 2024
volume
- 608
Digital Object Identifier (DOI)
International Standard Serial Number (ISSN)
- 0925-2312
Electronic International Standard Serial Number (EISSN)
- 1872-8286
abstract
- General-use quadcopters have been under development for over a decade, but many of their potential applications are still under evaluation and have not yet been adopted in many of the areas that could benefit from them. While the current generation of quadcopters uses a mature set of control algorithms, the next steps, especially as autonomous features are developed, should involve a more complex learning capability able to adapt to unknown circumstances in a safe and reliable way. This paper provides baseline quadcopter control models learnt using eight general reinforcement learning (RL) algorithms in a simulated environment, with the objective of establishing a reference performance, in terms of both precision and generation cost, for a simple set of trajectories. Each algorithm uses a tailored set of hyperparameters and, additionally, the influence of random seeds is studied. While not all algorithms converge within the allocated computing budget, the more complex ones are able to provide stable and precise control models. This paper recommends the TD3 algorithm as a reference for comparison with new RL algorithms. Additional guidance for future work is provided based on the weaknesses identified in the learning process, especially the strong dependence of agent performance on random seeds.
Classification
subjects
- Computer Science
keywords
- reinforcement learning; continuous control; model-free; quadcopter