Characterizing poisoning attacks on generalistic multi-modal AI models

Recent breakthroughs in transformers, multimodality, and multitasking have paved the way for the emergence of Generalistic Artificial Intelligence (GAI) models. While poisoning attacks are well studied in traditional AI models, they have not been characterized for multi-modal GAI ones. Therefore, this paper addresses representative data poisoning techniques (label manipulation and backdooring) across different text-, image- and
video-based tasks. Results show that poisoning can be stealthily spread through unrelated tasks while preserving the overall model performance. Label flipping can be used to change words inside the knowledge of the model, maintaining the same levels of effect across all the tasks. Backdoor attacks are transferred from one task to another with just a 5% of poison. At the same time, the effect depends on the task – image question answering gets significantly affected, with an attack success rate of 21 % and 76% for visual and textual backdoors, respectively.

Characterizing poisoning attacks on generalistic multi-modal AI models Articles