Details
Original language | English |
---|---|
Pages (from-to) | 597-611 |
Number of pages | 15 |
Journal | Annales des Telecommunications/Annals of Telecommunications |
Volume | 75 |
Issue number | 9-10 |
Early online date | 8 Jul 2020 |
Publication status | Published - Oct 2020 |
Abstract
Traditional active learning tries to identify instances for which acquiring the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, where both the instance and the feature must be selected intelligently, and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at a time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets, show that our developed methods outperform the random acquisition on 7 data sets and have a comparable performance on the remaining 2.
Keywords
- Active feature acquisition, Data streams, Feature drift
ASJC Scopus subject areas
- Engineering(all)
- Electrical and Electronic Engineering
Cite this
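Active feature acquisition on data streams under feature drift. / Beyer, Christian; Büttner, Maik; Unnikrishnan, Vishnu; Schleicher, Miro; Ntoutsi, Eirini; Spiliopoulou, Myra. In: Annales des Telecommunications/Annals of Telecommunications, Vol. 75, No. 9-10, 10.2020, p. 597-611.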
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Active feature acquisition on data streams under feature drift
AU - Beyer, Christian
AU - Büttner, Maik
AU - Unnikrishnan, Vishnu
AU - Schleicher, Miro
AU - Ntoutsi, Eirini
AU - Spiliopoulou, Myra
N1 - Funding information: Open Access funding provided by Projekt DEAL. This work is partially funded by the German Research Foundation, project OSCAR “Opinion Stream Classification with Ensembles and Active Learners.” The principal investigators of OSCAR are Myra Spiliopoulou and Eirini Ntoutsi. Additionally, Christian Beyer is also partially funded by a PhD grant from the federal state of Saxony-Anhalt.
PY - 2020/10
Y1 - 2020/10
AB - Traditional active learning tries to identify instances for which acquiring the label increases model performance under budget constraints. Less research has been devoted to the task of actively acquiring feature values, where both the instance and the feature must be selected intelligently, and even less to a scenario where the instances arrive in a stream with feature drift. We propose an active feature acquisition strategy for data streams with feature drift, as well as an active feature acquisition evaluation framework. We also implement a baseline that chooses features randomly and compare the random approach against eight different methods in a scenario where we can acquire at most one feature at a time per instance and where all features are considered to cost the same. Our initial experiments on 9 different data sets, with 7 different degrees of missing features and 8 different budgets, show that our developed methods outperform the random acquisition on 7 data sets and have a comparable performance on the remaining 2.
KW - Active feature acquisition
KW - Data streams
KW - Feature drift
UR - http://www.scopus.com/inward/record.url?scp=85087713186&partnerID=8YFLogxK
U2 - 10.1007/s12243-020-00775-2
DO - 10.1007/s12243-020-00775-2
M3 - Article
AN - SCOPUS:85087713186
VL - 75
SP - 597
EP - 611
JO - Annales des Telecommunications/Annals of Telecommunications
JF - Annales des Telecommunications/Annals of Telecommunications
SN - 0003-4347
IS - 9-10
ER -