Active feature acquisition on data streams under feature drift

Research output: Contribution to journal › Article › Research › peer review

Authors

  • Christian Beyer
  • Maik Büttner
  • Vishnu Unnikrishnan
  • Miro Schleicher
  • Eirini Ntoutsi
  • Myra Spiliopoulou

Research Organisations

External Research Organisations

  • Otto-von-Guericke University Magdeburg

Details

Original language: English
Pages (from-to): 597-611
Number of pages: 15
Journal: Annales des Telecommunications/Annals of Telecommunications
Volume: 75
Issue number: 9-10
Early online date: 8 Jul 2020
Publication status: Published - Oct 2020

Abstract

Traditional active learning tries to identify instances for which the acquisition of the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, where both the instance and the feature must be selected intelligently, and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare this random approach against eight different methods in a scenario where we can acquire at most one feature at a time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets, show that our developed methods outperform random acquisition on 7 data sets and have comparable performance on the remaining two.
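
The random baseline setting described in the abstract (at most one feature acquired per instance, all features costing the same) could look roughly like the sketch below. This is an illustration only, not the authors' implementation: the function name `random_acquisition`, the use of `None` for missing values, the `oracle` list of true values, and the choice to spend the budget greedily from the start of the stream are all assumptions made for the example.

```python
import random
from typing import List, Optional

# None marks a missing feature value (an assumption for this sketch).
Instance = List[Optional[float]]


def random_acquisition(
    stream: List[Instance],
    oracle: List[List[float]],
    budget: int,
    rng: random.Random,
) -> List[Instance]:
    """Fill at most one missing feature per instance, chosen uniformly at random.

    Every acquisition costs one unit; nothing is acquired once the budget is spent.
    `oracle` holds the true values that an acquisition would reveal.
    """
    completed = []
    for instance, truth in zip(stream, oracle):
        row = list(instance)  # copy so the incoming stream is left untouched
        missing = [j for j, value in enumerate(row) if value is None]
        if missing and budget > 0:
            j = rng.choice(missing)  # random choice among the missing features
            row[j] = truth[j]        # "acquire" the value from the oracle
            budget -= 1              # equal cost for every feature
        completed.append(row)
    return completed
```

A more realistic budget manager would spread acquisitions across the stream instead of spending them on the first instances; the greedy spending here only keeps the sketch short.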

Keywords

    Active feature acquisition, Data streams, Feature drift

Cite this

Active feature acquisition on data streams under feature drift. / Beyer, Christian; Büttner, Maik; Unnikrishnan, Vishnu et al.
In: Annales des Telecommunications/Annals of Telecommunications, Vol. 75, No. 9-10, 10.2020, p. 597-611.

Beyer, C, Büttner, M, Unnikrishnan, V, Schleicher, M, Ntoutsi, E & Spiliopoulou, M 2020, 'Active feature acquisition on data streams under feature drift', Annales des Telecommunications/Annals of Telecommunications, vol. 75, no. 9-10, pp. 597-611. https://doi.org/10.1007/s12243-020-00775-2
Beyer, C., Büttner, M., Unnikrishnan, V., Schleicher, M., Ntoutsi, E., & Spiliopoulou, M. (2020). Active feature acquisition on data streams under feature drift. Annales des Telecommunications/Annals of Telecommunications, 75(9-10), 597-611. https://doi.org/10.1007/s12243-020-00775-2
Beyer C, Büttner M, Unnikrishnan V, Schleicher M, Ntoutsi E, Spiliopoulou M. Active feature acquisition on data streams under feature drift. Annales des Telecommunications/Annals of Telecommunications. 2020 Oct;75(9-10):597-611. Epub 2020 Jul 8. doi: 10.1007/s12243-020-00775-2
Beyer, Christian ; Büttner, Maik ; Unnikrishnan, Vishnu et al. / Active feature acquisition on data streams under feature drift. In: Annales des Telecommunications/Annals of Telecommunications. 2020 ; Vol. 75, No. 9-10. pp. 597-611.
@article{b74aeacbcfb84386a76a8301fce9b010,
title = "Active feature acquisition on data streams under feature drift",
abstract = "Traditional active learning tries to identify instances for which the acquisition of the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, whereupon both the instance and the feature must be selected intelligently and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at the time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets show that our developed methods outperform the random acquisition on 7 data sets and have a comparable performance on the remaining two.",
keywords = "Active feature acquisition, Data streams, Feature drift",
author = "Christian Beyer and Maik B{\"u}ttner and Vishnu Unnikrishnan and Miro Schleicher and Eirini Ntoutsi and Myra Spiliopoulou",
note = "Funding information: Open Access funding provided by Projekt DEAL. This work is partially funded by the German Research Foundation, project OSCAR “Opinion Stream Classification with Ensembles and Active Learners.” The principal investigators of OSCAR are Myra Spiliopoulou and Eirini Ntoutsi. Additionally, Christian Beyer is also partially funded by a PhD grant from the federal state of Saxony-Anhalt.",
year = "2020",
month = oct,
doi = "10.1007/s12243-020-00775-2",
language = "English",
volume = "75",
pages = "597--611",
journal = "Annales des Telecommunications/Annals of Telecommunications",
issn = "0003-4347",
publisher = "Springer Paris",
number = "9-10",
}

TY - JOUR

T1 - Active feature acquisition on data streams under feature drift

AU - Beyer, Christian

AU - Büttner, Maik

AU - Unnikrishnan, Vishnu

AU - Schleicher, Miro

AU - Ntoutsi, Eirini

AU - Spiliopoulou, Myra

N1 - Funding information: Open Access funding provided by Projekt DEAL. This work is partially funded by the German Research Foundation, project OSCAR “Opinion Stream Classification with Ensembles and Active Learners.” The principal investigators of OSCAR are Myra Spiliopoulou and Eirini Ntoutsi. Additionally, Christian Beyer is also partially funded by a PhD grant from the federal state of Saxony-Anhalt.

PY - 2020/10

Y1 - 2020/10

N2 - Traditional active learning tries to identify instances for which the acquisition of the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, whereupon both the instance and the feature must be selected intelligently and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at the time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets show that our developed methods outperform the random acquisition on 7 data sets and have a comparable performance on the remaining two.

AB - Traditional active learning tries to identify instances for which the acquisition of the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, whereupon both the instance and the feature must be selected intelligently and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at the time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets show that our developed methods outperform the random acquisition on 7 data sets and have a comparable performance on the remaining two.

KW - Active feature acquisition

KW - Data streams

KW - Feature drift

UR - http://www.scopus.com/inward/record.url?scp=85087713186&partnerID=8YFLogxK

U2 - 10.1007/s12243-020-00775-2

DO - 10.1007/s12243-020-00775-2

M3 - Article

AN - SCOPUS:85087713186

VL - 75

SP - 597

EP - 611

JO - Annales des Telecommunications/Annals of Telecommunications

JF - Annales des Telecommunications/Annals of Telecommunications

SN - 0003-4347

IS - 9-10

ER -