An evaluation of data stream clustering algorithms

Research output: Contribution to journal, Review article, Research, peer review

Authors

  • Stratos Mansalis
  • Eirini Ntoutsi
  • Nikos Pelekis
  • Yannis Theodoridis

Research Organisations

External Research Organisations

  • University of Piraeus

Details

Original language: English
Pages (from-to): 167-187
Number of pages: 21
Journal: Statistical Analysis and Data Mining
Volume: 11
Issue number: 4
Early online date: 25 Jun 2018
Publication status: Published - 12 Jul 2018

Abstract

Data stream clustering is a hot research area due to the abundance of data streams collected nowadays and the need for understanding and acting upon such sort of data. Unsupervised learning (clustering) comprises one of the most popular data mining tasks for gaining insights into the data. Clustering is a challenging task, while clustering over data streams involves additional challenges such as the single pass constraint over the raw data and the need for fast response. Moreover, dealing with an infinite and fast changing data stream implies that the clustering model extracted upon such sort of data is also subject to evolution over time. Several stream clustering surveys exist already in the literature; however, they focus on a theoretical presentation of the surveyed algorithms. On the contrary, in this paper, we survey the state-of-the-art stream clustering algorithms and we evaluate their performance in different data sets and for different parameter settings.

Keywords

    data stream clustering, data streams, evaluation, experimental, survey

Cite this

An evaluation of data stream clustering algorithms. / Mansalis, Stratos; Ntoutsi, Eirini; Pelekis, Nikos et al.
In: Statistical Analysis and Data Mining, Vol. 11, No. 4, 12.07.2018, p. 167-187.

Mansalis, S, Ntoutsi, E, Pelekis, N & Theodoridis, Y 2018, 'An evaluation of data stream clustering algorithms', Statistical Analysis and Data Mining, vol. 11, no. 4, pp. 167-187. https://doi.org/10.1002/sam.11380
Mansalis, S., Ntoutsi, E., Pelekis, N., & Theodoridis, Y. (2018). An evaluation of data stream clustering algorithms. Statistical Analysis and Data Mining, 11(4), 167-187. https://doi.org/10.1002/sam.11380
Mansalis S, Ntoutsi E, Pelekis N, Theodoridis Y. An evaluation of data stream clustering algorithms. Statistical Analysis and Data Mining. 2018 Jul 12;11(4):167-187. Epub 2018 Jun 25. doi: 10.1002/sam.11380
Mansalis, Stratos ; Ntoutsi, Eirini ; Pelekis, Nikos et al. / An evaluation of data stream clustering algorithms. In: Statistical Analysis and Data Mining. 2018 ; Vol. 11, No. 4. pp. 167-187.
@article{81a8c798e85c436485ed1626770f6d1b,
title = "An evaluation of data stream clustering algorithms",
abstract = "Data stream clustering is a hot research area due to the abundance of data streams collected nowadays and the need for understanding and acting upon such sort of data. Unsupervised learning (clustering) comprises one of the most popular data mining tasks for gaining insights into the data. Clustering is a challenging task, while clustering over data streams involves additional challenges such as the single pass constraint over the raw data and the need for fast response. Moreover, dealing with an infinite and fast changing data stream implies that the clustering model extracted upon such sort of data is also subject to evolution over time. Several stream clustering surveys exist already in the literature; however, they focus on a theoretical presentation of the surveyed algorithms. On the contrary, in this paper, we survey the state-of-the-art stream clustering algorithms and we evaluate their performance in different data sets and for different parameter settings.",
keywords = "data stream clustering, data streams, evaluation, experimental, survey",
author = "Stratos Mansalis and Eirini Ntoutsi and Nikos Pelekis and Yannis Theodoridis",
note = "Publisher Copyright: {\textcopyright} 2018 Wiley Periodicals, Inc. Copyright: Copyright 2018 Elsevier B.V., All rights reserved.",
year = "2018",
month = jul,
day = "12",
doi = "10.1002/sam.11380",
language = "English",
volume = "11",
pages = "167--187",
journal = "Statistical Analysis and Data Mining",
issn = "1932-1864",
publisher = "John Wiley and Sons Inc.",
number = "4",

}

TY - JOUR

T1 - An evaluation of data stream clustering algorithms

AU - Mansalis, Stratos

AU - Ntoutsi, Eirini

AU - Pelekis, Nikos

AU - Theodoridis, Yannis

N1 - Publisher Copyright: © 2018 Wiley Periodicals, Inc. Copyright: Copyright 2018 Elsevier B.V., All rights reserved.

PY - 2018/7/12

Y1 - 2018/7/12

AB - Data stream clustering is a hot research area due to the abundance of data streams collected nowadays and the need for understanding and acting upon such sort of data. Unsupervised learning (clustering) comprises one of the most popular data mining tasks for gaining insights into the data. Clustering is a challenging task, while clustering over data streams involves additional challenges such as the single pass constraint over the raw data and the need for fast response. Moreover, dealing with an infinite and fast changing data stream implies that the clustering model extracted upon such sort of data is also subject to evolution over time. Several stream clustering surveys exist already in the literature; however, they focus on a theoretical presentation of the surveyed algorithms. On the contrary, in this paper, we survey the state-of-the-art stream clustering algorithms and we evaluate their performance in different data sets and for different parameter settings.

KW - data stream clustering

KW - data streams

KW - evaluation

KW - experimental

KW - survey

UR - http://www.scopus.com/inward/record.url?scp=85049771982&partnerID=8YFLogxK

U2 - 10.1002/sam.11380

DO - 10.1002/sam.11380

M3 - Review article

AN - SCOPUS:85049771982

VL - 11

SP - 167

EP - 187

JO - Statistical Analysis and Data Mining

JF - Statistical Analysis and Data Mining

SN - 1932-1864

IS - 4

ER -