Details
Original language | English |
---|---|
Pages (from-to) | 167-187 |
Number of pages | 21 |
Journal | Statistical Analysis and Data Mining |
Volume | 11 |
Issue number | 4 |
Early online date | 25 Jun 2018 |
Publication status | Published - 12 Jul 2018 |
Abstract
Data stream clustering is an active research area, driven by the abundance of data streams collected nowadays and the need to understand and act upon such data. Unsupervised learning (clustering) is one of the most popular data mining tasks for gaining insights into data. Clustering is a challenging task in itself, and clustering over data streams involves additional challenges, such as the single-pass constraint over the raw data and the need for fast response. Moreover, dealing with an infinite and fast-changing data stream implies that the clustering model extracted from such data is also subject to evolution over time. Several stream clustering surveys already exist in the literature; however, they focus on a theoretical presentation of the surveyed algorithms. In contrast, in this paper we survey the state-of-the-art stream clustering algorithms and evaluate their performance on different data sets and for different parameter settings.
Keywords
- data stream clustering, data streams, evaluation, experimental, survey
ASJC Scopus subject areas
- Mathematics(all)
  - Analysis
- Computer Science(all)
  - Information Systems
  - Computer Science Applications
Cite this
In: Statistical Analysis and Data Mining, Vol. 11, No. 4, 12.07.2018, p. 167-187.
Research output: Contribution to journal › Review article › Research › peer review
TY - JOUR
T1 - An evaluation of data stream clustering algorithms
AU - Mansalis, Stratos
AU - Ntoutsi, Eirini
AU - Pelekis, Nikos
AU - Theodoridis, Yannis
N1 - Publisher Copyright: © 2018 Wiley Periodicals, Inc. Copyright: Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2018/7/12
Y1 - 2018/7/12
KW - data stream clustering
KW - data streams
KW - evaluation
KW - experimental
KW - survey
UR - http://www.scopus.com/inward/record.url?scp=85049771982&partnerID=8YFLogxK
U2 - 10.1002/sam.11380
DO - 10.1002/sam.11380
M3 - Review article
AN - SCOPUS:85049771982
VL - 11
SP - 167
EP - 187
JO - Statistical Analysis and Data Mining
JF - Statistical Analysis and Data Mining
SN - 1932-1864
IS - 4
ER -