Learning under Feature Drifts in Textual Streams

Research output: Chapter in book/report/conference proceeding, Conference contribution, Research, peer review

Authors

  • Damianos P. Melidis
  • Myra Spiliopoulou
  • Eirini Ntoutsi

External Research Organisations

  • Otto-von-Guericke University Magdeburg

Details

Original language: English
Title of host publication: CIKM '18
Subtitle of host publication: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
Editors: Norman Paton, Selcuk Candan, Haixun Wang, James Allan, Rakesh Agrawal, Alexandros Labrinidis, Alfredo Cuzzocrea, Mohammed Zaki, Divesh Srivastava, Andrei Broder, Assaf Schuster
Pages: 527-536
Number of pages: 10
ISBN (electronic): 9781450360142
Publication status: Published - 17 Oct 2018
Event: 27th ACM International Conference on Information and Knowledge Management, CIKM 2018 - Torino, Italy
Duration: 22 Oct 2018 to 26 Oct 2018

Abstract

Huge volumes of text are now generated as streams, especially on social networks such as Twitter and Facebook. Because discussion topics and user opinions on those topics change drastically over time, these streams undergo changes in data distribution, which lead to changes in the concept to be learned, a phenomenon called concept drift. One particular type of drift that has not yet attracted much attention is feature drift, i.e., changes in the features that are relevant for the learning task at hand. In this work, we propose an approach for handling feature drifts in textual streams. Our approach integrates i) an ensemble-based mechanism to accurately predict the feature/word values for the next time point, taking into account that different features might be subject to different temporal trends, and ii) a sketch-based feature space maintenance mechanism that allows for memory-bounded maintenance of the feature space over the stream. Experiments with textual streams from the sentiment analysis, email preference and spam detection domains demonstrate that our approach achieves significantly better or competitive performance compared to baselines.
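
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which sketch structure or which base predictors are used, so the Count-Min sketch, the three base predictors (`last_value`, `moving_average`, `linear_trend`), the inverse-error weighting, and all class and parameter names below are assumptions chosen only for illustration.

```python
import hashlib
from collections import deque


class CountMinSketch:
    """Fixed-size approximate word counter (assumed structure, not from the paper).

    Memory is bounded by width * depth regardless of vocabulary size.
    Estimates can overcount on hash collisions but never undercount.
    """

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, word, row):
        # One salted hash per row, deterministic across runs.
        digest = hashlib.blake2b(
            word.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(2, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, word, count=1):
        for row in range(self.depth):
            self.table[row][self._index(word, row)] += count

    def estimate(self, word):
        # The minimum over rows is the least collision-inflated counter.
        return min(
            self.table[row][self._index(word, row)] for row in range(self.depth)
        )


def last_value(history):
    return float(history[-1])


def moving_average(history):
    return sum(history) / len(history)


def linear_trend(history):
    # Extrapolate the most recent step; counts cannot go below zero.
    if len(history) < 2:
        return float(history[-1])
    return max(0.0, 2.0 * history[-1] - history[-2])


class EnsembleForecaster:
    """Per-word forecaster: base predictors weighted by inverse past error."""

    def __init__(self, predictors=(last_value, moving_average, linear_trend),
                 window=5):
        self.predictors = predictors
        self.history = deque(maxlen=window)
        self.errors = [0.0] * len(predictors)

    def update(self, value):
        # Score each predictor on the newly observed value, then store it.
        if self.history:
            past = list(self.history)
            for i, predictor in enumerate(self.predictors):
                self.errors[i] += abs(predictor(past) - value)
        self.history.append(value)

    def predict(self):
        # Weighted average of the base predictions for the next time point.
        if not self.history:
            return 0.0
        past = list(self.history)
        weights = [1.0 / (1.0 + err) for err in self.errors]
        total = sum(w * p(past) for w, p in zip(weights, self.predictors))
        return total / sum(weights)
```

In this sketch, a stream processor would call `CountMinSketch.add` for every incoming word, read `estimate` once per time window, and pass that value to the word's `EnsembleForecaster.update`. Words with a steady trend end up dominated by `linear_trend`, while noisy words lean on `moving_average`, which is one simple way to let different features follow different temporal trends.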

Keywords

    Concept Drifts, Ensemble Learning, Feature Drifts, Textual Streams, Time Series

Cite this

Learning under Feature Drifts in Textual Streams. / Melidis, Damianos P.; Spiliopoulou, Myra; Ntoutsi, Eirini.
CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ed. / Norman Paton; Selcuk Candan; Haixun Wang; James Allan; Rakesh Agrawal; Alexandros Labrinidis; Alfredo Cuzzocrea; Mohammed Zaki; Divesh Srivastava; Andrei Broder; Assaf Schuster. 2018. p. 527-536.

Melidis, DP, Spiliopoulou, M & Ntoutsi, E 2018, Learning under Feature Drifts in Textual Streams. in N Paton, S Candan, H Wang, J Allan, R Agrawal, A Labrinidis, A Cuzzocrea, M Zaki, D Srivastava, A Broder & A Schuster (eds), CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. pp. 527-536, 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, 22 Oct 2018. https://doi.org/10.1145/3269206.3271717
Melidis, D. P., Spiliopoulou, M., & Ntoutsi, E. (2018). Learning under Feature Drifts in Textual Streams. In N. Paton, S. Candan, H. Wang, J. Allan, R. Agrawal, A. Labrinidis, A. Cuzzocrea, M. Zaki, D. Srivastava, A. Broder, & A. Schuster (Eds.), CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management (pp. 527-536) https://doi.org/10.1145/3269206.3271717
Melidis DP, Spiliopoulou M, Ntoutsi E. Learning under Feature Drifts in Textual Streams. In Paton N, Candan S, Wang H, Allan J, Agrawal R, Labrinidis A, Cuzzocrea A, Zaki M, Srivastava D, Broder A, Schuster A, editors, CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. 2018. p. 527-536 doi: 10.1145/3269206.3271717
Melidis, Damianos P. ; Spiliopoulou, Myra ; Ntoutsi, Eirini. / Learning under Feature Drifts in Textual Streams. CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. editor / Norman Paton ; Selcuk Candan ; Haixun Wang ; James Allan ; Rakesh Agrawal ; Alexandros Labrinidis ; Alfredo Cuzzocrea ; Mohammed Zaki ; Divesh Srivastava ; Andrei Broder ; Assaf Schuster. 2018. pp. 527-536
@inproceedings{d4cc414e3eb146b08aa4a153317dd62c,
title = "Learning under Feature Drifts in Textual Streams",
abstract = "Huge amounts of textual streams are generated nowadays, especially in social networks like Twitter and Facebook. As the discussion topics and user opinions on those topics change drastically with time, those streams undergo changes in data distribution, leading to changes in the concept to be learned, a phenomenon called concept drift. One particular type of drift, that has not yet attracted a lot of attention is feature drift, i.e., changes in the features that are relevant for the learning task at hand. In this work, we propose an approach for handling feature drifts in textual streams. Our approach integrates i) an ensemble-based mechanism to accurately predict the feature/word values for the next time-point by taking into account the different features might be subject to different temporal trends and ii) a sketch-based feature space maintenance mechanism that allows for a memory-bounded maintenance of the feature space over the stream. Experiments with textual streams from the sentiment analysis, email preference and spam detection demonstrate that our approach achieves significantly better or competitive performance compared to baselines.",
keywords = "Concept Drifts, Ensemble Learning, Feature Drifts, Textual Streams, Time Series",
author = "Melidis, {Damianos P.} and Myra Spiliopoulou and Eirini Ntoutsi",
note = "Funding information: The work of the first author was supported by the German Research Foundation (DFG) within the project OSCAR (Opinion Stream Classification with Ensembles and Active leaRners) for which the last two authors are the project{\textquoteright}s principal investigators. The last author and this work is partially supported by the European Commission within the ERC Advanced Grant ALEXANDRIA under grant No.; 27th ACM International Conference on Information and Knowledge Management, CIKM 2018 ; Conference date: 22-10-2018 Through 26-10-2018",
year = "2018",
month = oct,
day = "17",
doi = "10.1145/3269206.3271717",
language = "English",
pages = "527--536",
editor = "Norman Paton and Selcuk Candan and Haixun Wang and James Allan and Rakesh Agrawal and Alexandros Labrinidis and Alfredo Cuzzocrea and Mohammed Zaki and Divesh Srivastava and Andrei Broder and Assaf Schuster",
booktitle = "CIKM '18",

}

TY - GEN

T1 - Learning under Feature Drifts in Textual Streams

AU - Melidis, Damianos P.

AU - Spiliopoulou, Myra

AU - Ntoutsi, Eirini

N1 - Funding information: The work of the first author was supported by the German Research Foundation (DFG) within the project OSCAR (Opinion Stream Classification with Ensembles and Active leaRners) for which the last two authors are the project’s principal investigators. The last author and this work is partially supported by the European Commission within the ERC Advanced Grant ALEXANDRIA under grant No.

PY - 2018/10/17

Y1 - 2018/10/17

N2 - Huge amounts of textual streams are generated nowadays, especially in social networks like Twitter and Facebook. As the discussion topics and user opinions on those topics change drastically with time, those streams undergo changes in data distribution, leading to changes in the concept to be learned, a phenomenon called concept drift. One particular type of drift, that has not yet attracted a lot of attention is feature drift, i.e., changes in the features that are relevant for the learning task at hand. In this work, we propose an approach for handling feature drifts in textual streams. Our approach integrates i) an ensemble-based mechanism to accurately predict the feature/word values for the next time-point by taking into account the different features might be subject to different temporal trends and ii) a sketch-based feature space maintenance mechanism that allows for a memory-bounded maintenance of the feature space over the stream. Experiments with textual streams from the sentiment analysis, email preference and spam detection demonstrate that our approach achieves significantly better or competitive performance compared to baselines.

KW - Concept Drifts

KW - Ensemble Learning

KW - Feature Drifts

KW - Textual Streams

KW - Time Series

UR - http://www.scopus.com/inward/record.url?scp=85058012535&partnerID=8YFLogxK

U2 - 10.1145/3269206.3271717

DO - 10.1145/3269206.3271717

M3 - Conference contribution

AN - SCOPUS:85058012535

SP - 527

EP - 536

BT - CIKM '18

A2 - Paton, Norman

A2 - Candan, Selcuk

A2 - Wang, Haixun

A2 - Allan, James

A2 - Agrawal, Rakesh

A2 - Labrinidis, Alexandros

A2 - Cuzzocrea, Alfredo

A2 - Zaki, Mohammed

A2 - Srivastava, Divesh

A2 - Broder, Andrei

A2 - Schuster, Assaf

T2 - 27th ACM International Conference on Information and Knowledge Management, CIKM 2018

Y2 - 22 October 2018 through 26 October 2018

ER -