Fine-tuning and evaluation of DialoGPT on several datasets of English movies and TV series subtitles

publication date

  • March 2023

issue

  • 70

abstract

  • New streaming platforms have led to a proliferation of movies and series, most of them subtitled. These subtitles provide a large body of conversational text that is less formal, more interactive, and closer to how people actually communicate. Most transformer-based models developed to date have not been trained on conversational text. In this article, DialoGPT, a GPT-2 model for the dialogue task trained on a collection of Reddit posts, is fine-tuned and evaluated on several collections of English subtitles from popular movies and series. Experiments show that DialoGPT performs well and that English subtitles from movies and series can be an outstanding resource for chatbot development.
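
As a rough illustration of how subtitle lines might be prepared for fine-tuning a DialoGPT-style model, the sketch below joins consecutive subtitle lines into context-plus-response strings, with each turn terminated by GPT-2's end-of-text token. The abstract does not describe the article's actual preprocessing, so the function name `build_examples`, the `context_turns` parameter, and the sample lines are all hypothetical.

```python
# Minimal sketch, not the article's pipeline: treat each subtitle line as one
# dialogue turn and build training strings of the form
#   turn_1 <EOS> turn_2 <EOS> ... response <EOS>
EOS = "<|endoftext|>"  # end-of-text token used by GPT-2 and DialoGPT


def build_examples(lines, context_turns=2):
    """Return one training string per response line.

    `context_turns` (an assumed parameter) caps how many preceding
    lines are kept as context for each response.
    """
    # Drop blank lines and surrounding whitespace.
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    examples = []
    for i in range(1, len(cleaned)):
        start = max(0, i - context_turns)
        turns = cleaned[start:i + 1]  # context turns followed by the response
        examples.append(EOS.join(turns) + EOS)
    return examples


# Hypothetical subtitle lines, for illustration only.
subtitles = ["Where are you going?", "  ", "To the station.", "I'll drive you."]
examples = build_examples(subtitles)
for ex in examples:
    print(ex)
```

Strings in this form could then be tokenized and passed to a causal language model trainer. Treating every adjacent pair of subtitle lines as a conversational exchange is a simplification, since subtitles carry no speaker labels or scene boundaries.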

keywords

  • chatbot; dialogpt; gpt-2; transformer