Learning adversarial attack policies through multi-objective reinforcement learning

publication date

  • November 2020

start page

  • 1

end page

  • 11

volume

  • 96

International Standard Serial Number (ISSN)

  • 0952-1976

Electronic International Standard Serial Number (EISSN)

  • 1873-6769

abstract

  • Deep Reinforcement Learning has shown promising results in learning policies for complex sequential decision-making tasks. However, several adversarial attack strategies have revealed that these policies are vulnerable to perturbations of their observations. Most of these attacks build on adversarial example crafting techniques originally designed to fool classifiers, where an attack is considered successful if it makes the classifier output any wrong class. The major drawback of these approaches when applied to decision-making tasks is that they are blind to long-term goals. In contrast, this paper views the attack process as a sequential optimization problem, with the aim of learning a sequence of attacks in which the attacker must consider the long-term effects of each attack. We propose that such an attack policy must be learned with two objectives in view: on the one hand, the attack must pursue the maximum performance loss of the attacked policy; on the other hand, it should also minimize the cost of the attacks. We therefore propose a novel formulation of the process of learning an attack policy as a Multi-objective Markov Decision Process with two objectives: maximizing the performance loss of the attacked policy and minimizing the cost of the attacks. We also reveal the conflicting nature of these two objectives and use a Multi-objective Reinforcement Learning algorithm to draw the Pareto fronts for four well-known tasks: GridWorld, CartPole, Mountain Car, and Breakout.
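  • The sketch below illustrates the kind of two-objective reward structure the abstract describes: a vector-valued reward whose first component measures the performance loss inflicted on the attacked policy and whose second component penalizes the cost of attacking. It is a minimal, hypothetical example, not the paper's implementation; names such as attack_reward, victim_step_reward and per_attack_cost are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): a vector-valued reward
# for an attack policy modeled as a two-objective Markov Decision Process.
import numpy as np

def attack_reward(victim_step_reward: float, attacked: bool,
                  per_attack_cost: float = 1.0) -> np.ndarray:
    """Return a 2-component reward vector for the attacker.

    Component 0: performance loss of the attacked policy, taken here as
                 the negative of the victim's step reward.
    Component 1: negative attack cost (zero on steps with no attack),
                 so that both objectives are maximized by the attacker.
    """
    performance_loss = -victim_step_reward
    cost = -per_attack_cost if attacked else 0.0
    return np.array([performance_loss, cost])

# Attacking while the victim still earns +1.0 hurts both objectives;
# an attack that drives the victim's reward to -1.0 trades cost for loss.
print(attack_reward(1.0, attacked=True))    # [-1. -1.]
print(attack_reward(-1.0, attacked=True))   # [ 1. -1.]
print(attack_reward(1.0, attacked=False))   # [-1.  0.]
```

    Because the two components generally conflict (more attacks can cause more performance loss but always add cost), a multi-objective RL algorithm over such a vector reward yields a Pareto front of attack policies rather than a single optimum, which is what the paper reports for the four benchmark tasks.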

keywords

  • adversarial reinforcement learning; multi-objective reinforcement learning