Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study

Lorena Poenaru-Olaru; Luis Miranda da Cruz; Arie Van Deursen; Jan S. Rellermeyer

doi:10.48550/arXiv.2211.13098

Details

Original language	English
Title of host publication	Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
Editors	Shusaku Tsumoto, Yukio Ohsawa, Lei Chen, Dirk Van den Poel, Xiaohua Hu, Yoichi Motomura, Takuya Takagi, Lingfei Wu, Ying Xie, Akihiro Abe, Vijay Raghavan
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	3364-3373
Number of pages	10
ISBN (electronic)	9781665480451
ISBN (print)	978-1-6654-8046-8
Publication status	Published - 2022
Event	2022 IEEE International Conference on Big Data, Big Data 2022 - Osaka, Japan. Duration: 17 Dec 2022 → 20 Dec 2022

Abstract

As machine learning models increasingly replace traditional business logic in production systems, their lifecycle management is becoming a significant concern. Once deployed into production, machine learning models are constantly evaluated on new streaming data. Given the continuous data flow, shifting data, also known as concept drift, is ubiquitous in such settings. Concept drift usually impacts the performance of machine learning models; thus, identifying the moment when concept drift occurs is required. Concept drift is identified through concept drift detectors. In this work, we assess the reliability of concept drift detectors in identifying drift in time by exploring how late they report drifts and how many false alarms they signal. We compare the performance of the most popular drift detectors belonging to two different concept drift detector groups: error rate-based detectors and data distribution-based detectors. We assess their performance on both synthetic and real-world data. In the case of synthetic data, we investigate the ability of detectors to identify two types of concept drift, abrupt and gradual. Our findings aim to help practitioners understand which drift detector should be employed in different situations. To this end, we share a list of the most important observations made throughout this study, which can serve as guidelines for practical usage. Furthermore, based on our empirical results, we analyze the suitability of each concept drift detection group for use as an alarming system.

ASJC Scopus subject areas

Mathematics (all)
Modelling and Simulation
Computer Science (all)
Computer Networks and Communications
Computer Science (all)
Information Systems
Decision Sciences (all)
Information Systems and Management
Engineering (all)
Safety, Risk, Reliability and Quality
Mathematics (all)
Control and Optimization

Cite this

Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study. / Poenaru-Olaru, Lorena; Miranda da Cruz, Luis; Van Deursen, Arie et al.
Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022. ed. / Shusaku Tsumoto; Yukio Ohsawa; Lei Chen; Dirk Van den Poel; Xiaohua Hu; Yoichi Motomura; Takuya Takagi; Lingfei Wu; Ying Xie; Akihiro Abe; Vijay Raghavan. Institute of Electrical and Electronics Engineers Inc., 2022. pp. 3364-3373.

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer review

Poenaru-Olaru, L, Miranda da Cruz, L, Van Deursen, A & Rellermeyer, JS 2022, Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study. in S Tsumoto, Y Ohsawa, L Chen, D Van den Poel, X Hu, Y Motomura, T Takagi, L Wu, Y Xie, A Abe & V Raghavan (eds), Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022. Institute of Electrical and Electronics Engineers Inc., pp. 3364-3373, 2022 IEEE International Conference on Big Data, Big Data 2022, Osaka, Japan, 17 Dec 2022. https://doi.org/10.48550/arXiv.2211.13098, https://doi.org/10.1109/BigData55660.2022.10020292

Poenaru-Olaru, L., Miranda da Cruz, L., Van Deursen, A., & Rellermeyer, J. S. (2022). Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study. In S. Tsumoto, Y. Ohsawa, L. Chen, D. Van den Poel, X. Hu, Y. Motomura, T. Takagi, L. Wu, Y. Xie, A. Abe, & V. Raghavan (Eds.), Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022 (pp. 3364-3373). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.48550/arXiv.2211.13098, https://doi.org/10.1109/BigData55660.2022.10020292

Poenaru-Olaru L, Miranda da Cruz L, Van Deursen A, Rellermeyer JS. Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study. in Tsumoto S, Ohsawa Y, Chen L, Van den Poel D, Hu X, Motomura Y, Takagi T, Wu L, Xie Y, Abe A, Raghavan V, editors, Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022. Institute of Electrical and Electronics Engineers Inc. 2022. p. 3364-3373. doi: 10.48550/arXiv.2211.13098, 10.1109/BigData55660.2022.10020292

Poenaru-Olaru, Lorena ; Miranda da Cruz, Luis ; Van Deursen, Arie et al. / Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study. Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022. ed. / Shusaku Tsumoto ; Yukio Ohsawa ; Lei Chen ; Dirk Van den Poel ; Xiaohua Hu ; Yoichi Motomura ; Takuya Takagi ; Lingfei Wu ; Ying Xie ; Akihiro Abe ; Vijay Raghavan. Institute of Electrical and Electronics Engineers Inc., 2022. pp. 3364-3373

Download

@inproceedings{641ffe08f8e2489fadb846691ba98ec0,

title = "Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study",

abstract = "As machine learning models increasingly replace traditional business logic in the production system, their lifecycle management is becoming a significant concern. Once deployed into production, the machine learning models are constantly evaluated on new streaming data. Given the continuous data flow, shifting data, also known as concept drift, is ubiquitous in such settings. Concept drift usually impacts the performance of machine learning models, thus, identifying the moment when concept drift occurs is required. Concept drift is identified through concept drift detectors. In this work, we assess the reliability of concept drift detectors to identify drift in time by exploring how late are they reporting drifts and how many false alarms are they signaling. We compare the performance of the most popular drift detectors belonging to two different concept drift detector groups, error rate-based detectors and data distribution-based detectors. We assess their performance on both synthetic and real-world data. In the case of synthetic data, we investigate the performance of detectors to identify two types of concept drift, abrupt and gradual. Our findings aim to help practitioners understand which drift detector should be employed in different situations and, to achieve this, we share a list of the most important observations made throughout this study, which can serve as guidelines for practical usage. Furthermore, based on our empirical results, we analyze the suitability of each concept drift detection group to be used as an alarming system.",

keywords = "concept drift detection, machine learning lifecycle management",

author = "Lorena Poenaru-Olaru and {Miranda da Cruz}, Luis and {Van Deursen}, Arie and Rellermeyer, {Jan S.}",

note = "Funding Information: ACKNOWLEDGMENT This work was partially supported by ING through the AI for Fintech Research Lab with TU Delft. ; 2022 IEEE International Conference on Big Data, Big Data 2022 ; Conference date: 17-12-2022 Through 20-12-2022",

year = "2022",

doi = "10.48550/arXiv.2211.13098",

language = "English",

isbn = "978-1-6654-8046-8",

pages = "3364--3373",

editor = "Shusaku Tsumoto and Yukio Ohsawa and Lei Chen and {Van den Poel}, Dirk and Xiaohua Hu and Yoichi Motomura and Takuya Takagi and Lingfei Wu and Ying Xie and Akihiro Abe and Vijay Raghavan",

booktitle = "Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

address = "United States",

}

Download

TY - GEN

T1 - Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study

AU - Poenaru-Olaru, Lorena

AU - Miranda da Cruz, Luis

AU - Van Deursen, Arie

AU - Rellermeyer, Jan S.

N1 - Funding Information: ACKNOWLEDGMENT This work was partially supported by ING through the AI for Fintech Research Lab with TU Delft.

PY - 2022

Y1 - 2022

AB - As machine learning models increasingly replace traditional business logic in the production system, their lifecycle management is becoming a significant concern. Once deployed into production, the machine learning models are constantly evaluated on new streaming data. Given the continuous data flow, shifting data, also known as concept drift, is ubiquitous in such settings. Concept drift usually impacts the performance of machine learning models, thus, identifying the moment when concept drift occurs is required. Concept drift is identified through concept drift detectors. In this work, we assess the reliability of concept drift detectors to identify drift in time by exploring how late are they reporting drifts and how many false alarms are they signaling. We compare the performance of the most popular drift detectors belonging to two different concept drift detector groups, error rate-based detectors and data distribution-based detectors. We assess their performance on both synthetic and real-world data. In the case of synthetic data, we investigate the performance of detectors to identify two types of concept drift, abrupt and gradual. Our findings aim to help practitioners understand which drift detector should be employed in different situations and, to achieve this, we share a list of the most important observations made throughout this study, which can serve as guidelines for practical usage. Furthermore, based on our empirical results, we analyze the suitability of each concept drift detection group to be used as an alarming system.

KW - concept drift detection

KW - machine learning lifecycle management

UR - http://www.scopus.com/inward/record.url?scp=85147976931&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2211.13098

DO - 10.48550/arXiv.2211.13098

M3 - Conference contribution

AN - SCOPUS:85147976931

SN - 978-1-6654-8046-8

SP - 3364

EP - 3373

BT - Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022

A2 - Tsumoto, Shusaku

A2 - Ohsawa, Yukio

A2 - Chen, Lei

A2 - Van den Poel, Dirk

A2 - Hu, Xiaohua

A2 - Motomura, Yoichi

A2 - Takagi, Takuya

A2 - Wu, Lingfei

A2 - Xie, Ying

A2 - Abe, Akihiro

A2 - Raghavan, Vijay

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2022 IEEE International Conference on Big Data, Big Data 2022

Y2 - 17 December 2022 through 20 December 2022

ER -

By the same authors

Toward Competitive Serverless Deep Learning

The Performance of Distributed Applications: A Traffic Shaping Perspective

Log Parsing Evaluation in the Era of Modern Software Systems

Brug: An Adaptive Memory (Re-)Allocator

Is Your Anomaly Detector Ready for Change? Adapting AIOps Solutions to the Real World
