- November 2021
- As the COVID-19 pandemic spread worldwide and social media became the main source of information for many people, public reactions to the topic have varied widely. In December 2020, the World Health Organization (WHO) announced that it was fighting an 'infodemic' alongside the pandemic itself. An 'infodemic' is the spread of uncontrolled, unfiltered information that can have a negative impact on society. Left unmanaged, an aggressive or negative tweet can harm and mislead its recipients. WHO authorities have therefore called on the academic and scientific community to develop tools for managing the infodemic using digital technologies and data science. The goal of this study is to develop and apply deep learning models for natural language processing to classify a collection of tweets that refer to the COVID-19 pandemic. Several simpler, widely used models are applied first and serve as a benchmark for the deep learning methods, such as Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers (BERT). The experimental results show that the deep learning models outperform the traditional machine learning algorithms, with the BERT-based model performing best.
- bert; covid-19; deep learning; long short-term memory; lstm; machine learning; multi-classification; natural language processing; sentiment analysis; v
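
The abstract describes applying simpler, widely used models first as a benchmark for the deep learning methods. As a rough illustration of what such a baseline could look like, the sketch below trains a multinomial Naive Bayes classifier on bag-of-words counts using only the Python standard library. The abstract does not name the baseline models, the label set, or the preprocessing, so the choice of Naive Bayes, the three sentiment labels, the tokenizer, and the toy tweets are all assumptions made for illustration, not the study's actual setup or data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep alphanumeric runs. A real tweet pipeline would
    # also need to handle URLs, mentions, hashtags, and emoji.
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesBaseline:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy tweets and labels, invented for this example only.
train_texts = [
    "vaccines are safe and effective",
    "grateful for the nurses and doctors",
    "lockdown is ruining everything",
    "this pandemic is a terrible disaster",
    "new case numbers published today",
    "health ministry updates travel rules",
]
train_labels = ["positive", "positive", "negative",
                "negative", "neutral", "neutral"]

model = NaiveBayesBaseline().fit(train_texts, train_labels)
prediction = model.predict("doctors say vaccines are effective")
print(prediction)  # positive
```

In a benchmark of the kind the abstract describes, a baseline like this would be scored on held-out tweets with the same metric as the LSTM and BERT-based models, so that any improvement from the deep learning methods can be measured against it.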