Turning user generated health-related content into actionable knowledge through text analytics services

In recent years, people have increasingly discussed healthcare issues on social networks, not only with family and friends but also with strangers, and processing this user-generated content has become a new challenge. Such processing can support online crowd surveillance for several applications (pharmacovigilance and filtering of health content in blogs, among others) as well as knowledge extraction from unstructured text sources. This article describes a system that monitors health social media streams. It is based on several text analytics processes supported, among others, by MeaningCloud, a commercial platform that provides meaning extraction from texts in a Software as a Service mode. The architecture integrates several domain resources to detect drugs and drug effects: CIMA (official information about authorized drugs in Spain, maintained by the Spanish Agency of Medicines and Health Products), MedDRA (Medical Dictionary for Regulatory Activities), and SpanishDrugEffectDB, a database that contains relations between drugs and effects. Several ways of visualizing the data, including timelines and aggregated views, have been implemented. To assess performance, the system has been evaluated on Named Entity Recognition (NER) and Relation Extraction (RE) tasks.
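
The dictionary-driven detection of drugs, effects, and drug-effect relations described above can be illustrated with a minimal sketch. It is not the system's implementation and does not call MeaningCloud. The lexicons `DRUGS`, `EFFECTS`, and `KNOWN_RELATIONS` are hypothetical toy stand-ins for resources such as CIMA, MedDRA, and SpanishDrugEffectDB, and the functions `find_terms` and `extract_relations` are names invented for this illustration.

```python
import re
from typing import List, Set, Tuple

# Hypothetical toy lexicons (assumptions for illustration only). The system
# described in the article draws on CIMA, MedDRA and SpanishDrugEffectDB.
DRUGS: Set[str] = {"ibuprofen", "paracetamol"}
EFFECTS: Set[str] = {"headache", "nausea", "stomach pain"}
KNOWN_RELATIONS: Set[Tuple[str, str]] = {
    ("ibuprofen", "stomach pain"),
    ("paracetamol", "nausea"),
}


def find_terms(text: str, lexicon: Set[str]) -> List[str]:
    """Return the lexicon terms that occur in the text as whole words."""
    lowered = text.lower()
    found = []
    for term in sorted(lexicon):
        # Word boundaries avoid matching a term inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append(term)
    return found


def extract_relations(text: str) -> List[Tuple[str, str]]:
    """Pair co-occurring drugs and effects, keeping only known relations."""
    drugs = find_terms(text, DRUGS)
    effects = find_terms(text, EFFECTS)
    return [(d, e) for d in drugs for e in effects if (d, e) in KNOWN_RELATIONS]


post = "I took ibuprofen yesterday and now I have stomach pain and a headache."
print(find_terms(post, DRUGS))
print(find_terms(post, EFFECTS))
print(extract_relations(post))
```

Filtering co-occurring mentions against a relation database keeps only pairs that the database already records, which is one simple way to combine entity lexicons with a drug-effect resource. A production pipeline would also need to handle misspellings, lay vocabulary, and negation, which this sketch ignores.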
