Active feature acquisition on data streams under feature drift

Publication: Contribution to journal, Article, Research, Peer-reviewed

Authors

  • Christian Beyer
  • Maik Büttner
  • Vishnu Unnikrishnan
  • Miro Schleicher
  • Eirini Ntoutsi
  • Myra Spiliopoulou

External organisations

  • Otto-von-Guericke-Universität Magdeburg

Details

Original language: English
Pages (from - to): 597-611
Number of pages: 15
Journal: Annales des Telecommunications/Annals of Telecommunications
Volume: 75
Issue number: 9-10
Early online date: 8 July 2020
Publication status: Published - Oct 2020

Abstract

Traditional active learning tries to identify instances for which acquiring the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, where both the instance and the feature must be selected intelligently, and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at a time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets, show that our developed methods outperform random acquisition on 7 data sets and perform comparably on the remaining two.
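
The random baseline mentioned in the abstract can be pictured with a minimal Python sketch. This is an illustration only, not the authors' implementation: the function name `random_acquisition`, the representation of an instance as a dictionary with `None` for missing values, the use of the complete instance as an oracle, and the rule that the share of acquisitions among seen instances must stay at or below `budget` are all assumptions made here. The sketch acquires at most one feature per instance and treats every feature as equally expensive, as in the scenario described above.

```python
import random
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical representation: None marks a missing feature value.
Instance = Dict[str, Optional[float]]


def random_acquisition(
    stream: Iterable[Tuple[Instance, Dict[str, float]]],
    budget: float,
    seed: int = 0,
) -> Iterator[Instance]:
    """Yield each instance with at most one randomly chosen feature acquired.

    `stream` yields (incomplete, complete) pairs; the complete instance
    stands in for the oracle that returns a purchased feature value.
    `budget` is assumed to be the allowed fraction of instances for
    which one feature may be acquired (all features cost the same).
    """
    rng = random.Random(seed)
    seen = 0
    acquired = 0
    for incomplete, complete in stream:
        seen += 1
        result = dict(incomplete)  # copy so the input is left untouched
        missing = [name for name, value in result.items() if value is None]
        # Acquire only if doing so keeps the acquisition rate within budget.
        if missing and (acquired + 1) / seen <= budget:
            chosen = rng.choice(missing)
            result[chosen] = complete[chosen]
            acquired += 1
        yield result
```

A non-random strategy would replace the `rng.choice(missing)` call with a ranking of the missing features by some quality estimate; the budget check would stay the same.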

Cite this

Active feature acquisition on data streams under feature drift. / Beyer, Christian; Büttner, Maik; Unnikrishnan, Vishnu et al.
In: Annales des Telecommunications/Annals of Telecommunications, Vol. 75, No. 9-10, 10.2020, pp. 597-611.

Beyer, C, Büttner, M, Unnikrishnan, V, Schleicher, M, Ntoutsi, E & Spiliopoulou, M 2020, 'Active feature acquisition on data streams under feature drift', Annales des Telecommunications/Annals of Telecommunications, vol. 75, no. 9-10, pp. 597-611. https://doi.org/10.1007/s12243-020-00775-2
Beyer, C., Büttner, M., Unnikrishnan, V., Schleicher, M., Ntoutsi, E., & Spiliopoulou, M. (2020). Active feature acquisition on data streams under feature drift. Annales des Telecommunications/Annals of Telecommunications, 75(9-10), 597-611. https://doi.org/10.1007/s12243-020-00775-2
Beyer C, Büttner M, Unnikrishnan V, Schleicher M, Ntoutsi E, Spiliopoulou M. Active feature acquisition on data streams under feature drift. Annales des Telecommunications/Annals of Telecommunications. 2020 Oct;75(9-10):597-611. Epub 2020 Jul 8. doi: 10.1007/s12243-020-00775-2
Beyer, Christian ; Büttner, Maik ; Unnikrishnan, Vishnu et al. / Active feature acquisition on data streams under feature drift. In: Annales des Telecommunications/Annals of Telecommunications. 2020 ; Vol. 75, No. 9-10. pp. 597-611.
@article{b74aeacbcfb84386a76a8301fce9b010,
title = "Active feature acquisition on data streams under feature drift",
abstract = "Traditional active learning tries to identify instances for which the acquisition of the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, whereupon both the instance and the feature must be selected intelligently and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at the time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets show that our developed methods outperform the random acquisition on 7 data sets and have a comparable performance on the remaining two.",
keywords = "Active feature acquisition, Data streams, Feature drift",
author = "Christian Beyer and Maik B{\"u}ttner and Vishnu Unnikrishnan and Miro Schleicher and Eirini Ntoutsi and Myra Spiliopoulou",
note = "Funding information: Open Access funding provided by Projekt DEAL. This work is partially funded by the German Research Foundation, project OSCAR “Opinion Stream Classification with Ensembles and Active Learners.” The principal investigators of OSCAR are Myra Spiliopoulou and Eirini Ntoutsi. Additionally, Christian Beyer is also partially funded by a PhD grant from the federal state of Saxony-Anhalt.",
year = "2020",
month = oct,
doi = "10.1007/s12243-020-00775-2",
language = "English",
volume = "75",
pages = "597--611",
journal = "Annales des Telecommunications/Annals of Telecommunications",
issn = "0003-4347",
publisher = "Springer Paris",
number = "9-10",

}

TY  - JOUR
T1  - Active feature acquisition on data streams under feature drift
AU  - Beyer, Christian
AU  - Büttner, Maik
AU  - Unnikrishnan, Vishnu
AU  - Schleicher, Miro
AU  - Ntoutsi, Eirini
AU  - Spiliopoulou, Myra
N1  - Funding information: Open Access funding provided by Projekt DEAL. This work is partially funded by the German Research Foundation, project OSCAR “Opinion Stream Classification with Ensembles and Active Learners.” The principal investigators of OSCAR are Myra Spiliopoulou and Eirini Ntoutsi. Additionally, Christian Beyer is also partially funded by a PhD grant from the federal state of Saxony-Anhalt.
PY  - 2020/10
Y1  - 2020/10
N2  - Traditional active learning tries to identify instances for which the acquisition of the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, whereupon both the instance and the feature must be selected intelligently and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at the time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets show that our developed methods outperform the random acquisition on 7 data sets and have a comparable performance on the remaining two.
AB  - Traditional active learning tries to identify instances for which the acquisition of the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, whereupon both the instance and the feature must be selected intelligently and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at the time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets show that our developed methods outperform the random acquisition on 7 data sets and have a comparable performance on the remaining two.
KW  - Active feature acquisition
KW  - Data streams
KW  - Feature drift
UR  - http://www.scopus.com/inward/record.url?scp=85087713186&partnerID=8YFLogxK
U2  - 10.1007/s12243-020-00775-2
DO  - 10.1007/s12243-020-00775-2
M3  - Article
AN  - SCOPUS:85087713186
VL  - 75
SP  - 597
EP  - 611
JO  - Annales des Telecommunications/Annals of Telecommunications
JF  - Annales des Telecommunications/Annals of Telecommunications
SN  - 0003-4347
IS  - 9-10
ER  -