Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter?

Research output: Working paper/Preprint

Authors

  • Avaré Stewart
  • Sara Romano
  • Nattiya Kanhabua
  • Sergio Di Martino
  • Wolf Siberski
  • Antonino Mazzeo
  • Wolfgang Nejdl
  • Ernesto Diaz-Aviles

Research Organisations

External Research Organisations

  • Monte S. Angelo University Federico II

Details

Original language: English
Number of pages: 20
Publication status: E-pub ahead of print - 10 Nov 2016

Abstract

Social media services such as Twitter are a valuable source of information for decision support systems. Many studies have shown that this also holds for the medical domain, where Twitter is considered a viable tool for public health officials to sift through relevant information for the early detection, management, and control of epidemic outbreaks. This is possible due to the inherent capability of social media services to transmit information faster than traditional channels. However, the majority of current studies have limited their scope to the detection of common and seasonal health recurring events (e.g., Influenza-like Illness), partially due to the noisy nature of Twitter data, which makes outbreak detection and management very challenging. Within the European project M-Eco, we developed a Twitter-based Epidemic Intelligence (EI) system, which is designed to also handle a more general class of unexpected and aperiodic outbreaks. In particular, we faced three main research challenges in this endeavor: 1) dynamic classification to manage terminology evolution of Twitter messages, 2) alert generation to produce reliable outbreak alerts analyzing the (noisy) tweet time series, and 3) ranking and recommendation to support domain experts for better assessment of the generated alerts. In this paper, we empirically evaluate our proposed approach to these challenges using real-world outbreak datasets and a large collection of tweets. We validate our solution with domain experts, describe our experiences, and give a more realistic view on the benefits and issues of analyzing social media for public health.

Keywords

    cs.CY, cs.IR, cs.SI, stat.ML

Cite this

Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter? / Stewart, Avaré; Romano, Sara; Kanhabua, Nattiya et al.
2016.

Stewart, A, Romano, S, Kanhabua, N, Di Martino, S, Siberski, W, Mazzeo, A, Nejdl, W & Diaz-Aviles, E 2016 'Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter?'. <https://arxiv.org/abs/1611.03426>
Stewart, A., Romano, S., Kanhabua, N., Di Martino, S., Siberski, W., Mazzeo, A., Nejdl, W., & Diaz-Aviles, E. (2016). Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter? Advance online publication. https://arxiv.org/abs/1611.03426
Stewart A, Romano S, Kanhabua N, Di Martino S, Siberski W, Mazzeo A et al. Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter? 2016 Nov 10. Epub 2016 Nov 10.
Stewart, Avaré; Romano, Sara; Kanhabua, Nattiya et al. / Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter? 2016.
@techreport{20013d6574cc4268a28d31cfdeb63bd9,
title = "Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter?",
abstract = " Social media services such as Twitter are a valuable source of information for decision support systems. Many studies have shown that this also holds for the medical domain, where Twitter is considered a viable tool for public health officials to sift through relevant information for the early detection, management, and control of epidemic outbreaks. This is possible due to the inherent capability of social media services to transmit information faster than traditional channels. However, the majority of current studies have limited their scope to the detection of common and seasonal health recurring events (e.g., Influenza-like Illness), partially due to the noisy nature of Twitter data, which makes outbreak detection and management very challenging. Within the European project M-Eco, we developed a Twitter-based Epidemic Intelligence (EI) system, which is designed to also handle a more general class of unexpected and aperiodic outbreaks. In particular, we faced three main research challenges in this endeavor: 1) dynamic classification to manage terminology evolution of Twitter messages, 2) alert generation to produce reliable outbreak alerts analyzing the (noisy) tweet time series, and 3) ranking and recommendation to support domain experts for better assessment of the generated alerts. In this paper, we empirically evaluate our proposed approach to these challenges using real-world outbreak datasets and a large collection of tweets. We validate our solution with domain experts, describe our experiences, and give a more realistic view on the benefits and issues of analyzing social media for public health. ",
keywords = "cs.CY, cs.IR, cs.SI, stat.ML",
author = "Avar{\'e} Stewart and Sara Romano and Nattiya Kanhabua and Di Martino, Sergio and Wolf Siberski and Antonino Mazzeo and Wolfgang Nejdl and Ernesto Diaz-Aviles",
note = "ACM CCS Concepts: Applied computing - Health informatics; Information systems - Web mining; Document filtering; Novelty in information retrieval; Recommender systems; Human-centered computing - Social media",
year = "2016",
month = nov,
day = "10",
language = "English",
type = "WorkingPaper",
}

TY - UNPB

T1 - Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter?

AU - Stewart, Avaré

AU - Romano, Sara

AU - Kanhabua, Nattiya

AU - Di Martino, Sergio

AU - Siberski, Wolf

AU - Mazzeo, Antonino

AU - Nejdl, Wolfgang

AU - Diaz-Aviles, Ernesto

N1 - ACM CCS Concepts: Applied computing - Health informatics; Information systems - Web mining; Document filtering; Novelty in information retrieval; Recommender systems; Human-centered computing - Social media

PY - 2016/11/10

Y1 - 2016/11/10

N2 - Social media services such as Twitter are a valuable source of information for decision support systems. Many studies have shown that this also holds for the medical domain, where Twitter is considered a viable tool for public health officials to sift through relevant information for the early detection, management, and control of epidemic outbreaks. This is possible due to the inherent capability of social media services to transmit information faster than traditional channels. However, the majority of current studies have limited their scope to the detection of common and seasonal health recurring events (e.g., Influenza-like Illness), partially due to the noisy nature of Twitter data, which makes outbreak detection and management very challenging. Within the European project M-Eco, we developed a Twitter-based Epidemic Intelligence (EI) system, which is designed to also handle a more general class of unexpected and aperiodic outbreaks. In particular, we faced three main research challenges in this endeavor: 1) dynamic classification to manage terminology evolution of Twitter messages, 2) alert generation to produce reliable outbreak alerts analyzing the (noisy) tweet time series, and 3) ranking and recommendation to support domain experts for better assessment of the generated alerts. In this paper, we empirically evaluate our proposed approach to these challenges using real-world outbreak datasets and a large collection of tweets. We validate our solution with domain experts, describe our experiences, and give a more realistic view on the benefits and issues of analyzing social media for public health.

AB - Social media services such as Twitter are a valuable source of information for decision support systems. Many studies have shown that this also holds for the medical domain, where Twitter is considered a viable tool for public health officials to sift through relevant information for the early detection, management, and control of epidemic outbreaks. This is possible due to the inherent capability of social media services to transmit information faster than traditional channels. However, the majority of current studies have limited their scope to the detection of common and seasonal health recurring events (e.g., Influenza-like Illness), partially due to the noisy nature of Twitter data, which makes outbreak detection and management very challenging. Within the European project M-Eco, we developed a Twitter-based Epidemic Intelligence (EI) system, which is designed to also handle a more general class of unexpected and aperiodic outbreaks. In particular, we faced three main research challenges in this endeavor: 1) dynamic classification to manage terminology evolution of Twitter messages, 2) alert generation to produce reliable outbreak alerts analyzing the (noisy) tweet time series, and 3) ranking and recommendation to support domain experts for better assessment of the generated alerts. In this paper, we empirically evaluate our proposed approach to these challenges using real-world outbreak datasets and a large collection of tweets. We validate our solution with domain experts, describe our experiences, and give a more realistic view on the benefits and issues of analyzing social media for public health.

KW - cs.CY

KW - cs.IR

KW - cs.SI

KW - stat.ML

M3 - Preprint

BT - Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter?

ER -
