Details
Original language | English |
---|---|
Title of host publication | CIKM '18 |
Subtitle of host publication | Proceedings of the 27th ACM International Conference on Information and Knowledge Management |
Editors | Norman Paton, Selcuk Candan, Haixun Wang, James Allan, Rakesh Agrawal, Alexandros Labrinidis, Alfredo Cuzzocrea, Mohammed Zaki, Divesh Srivastava, Andrei Broder, Assaf Schuster |
Pages | 527-536 |
Number of pages | 10 |
ISBN (electronic) | 9781450360142 |
Publication status | Published - 17 Oct 2018 |
Event | 27th ACM International Conference on Information and Knowledge Management, CIKM 2018 - Torino, Italy |
Event duration | 22 Oct 2018 → 26 Oct 2018 |
Abstract
Huge amounts of textual streams are generated nowadays, especially in social networks like Twitter and Facebook. As the discussion topics and user opinions on those topics change drastically with time, those streams undergo changes in data distribution, leading to changes in the concept to be learned, a phenomenon called concept drift. One particular type of drift that has not yet attracted much attention is feature drift, i.e., changes in the features that are relevant for the learning task at hand. In this work, we propose an approach for handling feature drifts in textual streams. Our approach integrates i) an ensemble-based mechanism to accurately predict the feature/word values for the next time-point by taking into account that different features might be subject to different temporal trends and ii) a sketch-based feature space maintenance mechanism that allows for a memory-bounded maintenance of the feature space over the stream. Experiments with textual streams from the sentiment analysis, email preference and spam detection domains demonstrate that our approach achieves significantly better or competitive performance compared to baselines.
Keywords
- Concept Drifts, Ensemble Learning, Feature Drifts, Textual Streams, Time Series
ASJC Scopus subject areas
- Decision Sciences (all)
- General Decision Sciences
- Business, Management and Accounting (all)
- General Business, Management and Accounting
Cite this
Learning under Feature Drifts in Textual Streams. / Melidis, Damianos P.; Spiliopoulou, Myra; Ntoutsi, Eirini. CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ed. / Norman Paton; Selcuk Candan; Haixun Wang; James Allan; Rakesh Agrawal; Alexandros Labrinidis; Alfredo Cuzzocrea; Mohammed Zaki; Divesh Srivastava; Andrei Broder; Assaf Schuster. 2018. p. 527-536.
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Learning under Feature Drifts in Textual Streams
AU - Melidis, Damianos P.
AU - Spiliopoulou, Myra
AU - Ntoutsi, Eirini
N1 - Funding information: The work of the first author was supported by the German Research Foundation (DFG) within the project OSCAR (Opinion Stream Classification with Ensembles and Active leaRners), for which the last two authors are the project’s principal investigators. The last author and this work are partially supported by the European Commission within the ERC Advanced Grant ALEXANDRIA under grant No.
PY - 2018/10/17
Y1 - 2018/10/17
AB - Huge amounts of textual streams are generated nowadays, especially in social networks like Twitter and Facebook. As the discussion topics and user opinions on those topics change drastically with time, those streams undergo changes in data distribution, leading to changes in the concept to be learned, a phenomenon called concept drift. One particular type of drift that has not yet attracted much attention is feature drift, i.e., changes in the features that are relevant for the learning task at hand. In this work, we propose an approach for handling feature drifts in textual streams. Our approach integrates i) an ensemble-based mechanism to accurately predict the feature/word values for the next time-point by taking into account that different features might be subject to different temporal trends and ii) a sketch-based feature space maintenance mechanism that allows for a memory-bounded maintenance of the feature space over the stream. Experiments with textual streams from the sentiment analysis, email preference and spam detection domains demonstrate that our approach achieves significantly better or competitive performance compared to baselines.
KW - Concept Drifts
KW - Ensemble Learning
KW - Feature Drifts
KW - Textual Streams
KW - Time Series
UR - http://www.scopus.com/inward/record.url?scp=85058012535&partnerID=8YFLogxK
U2 - 10.1145/3269206.3271717
DO - 10.1145/3269206.3271717
M3 - Conference contribution
AN - SCOPUS:85058012535
SP - 527
EP - 536
BT - CIKM '18
A2 - Paton, Norman
A2 - Candan, Selcuk
A2 - Wang, Haixun
A2 - Allan, James
A2 - Agrawal, Rakesh
A2 - Labrinidis, Alexandros
A2 - Cuzzocrea, Alfredo
A2 - Zaki, Mohammed
A2 - Srivastava, Divesh
A2 - Broder, Andrei
A2 - Schuster, Assaf
T2 - 27th ACM International Conference on Information and Knowledge Management, CIKM 2018
Y2 - 22 October 2018 through 26 October 2018
ER -