Details
Original language | English |
---|---|
Title of host publication | Advances in Knowledge Discovery and Data Mining |
Subtitle | 25th Pacific-Asia Conference, PAKDD 2021, Virtual Event, May 11–14, 2021, Proceedings, Part I |
Editors | Kamal Karlapalem, Hong Cheng, Naren Ramakrishnan, R. K. Agrawal, P. Krishna Reddy, Jaideep Srivastava, Tanmoy Chakraborty |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 603-615 |
Number of pages | 13 |
ISBN (electronic) | 978-3-030-75762-5 |
ISBN (print) | 9783030757618 |
Publication status | Published - 9 May 2021 |
Event | 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2021 - Virtual, Online. Duration: 11 May 2021 → 14 May 2021 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12712 LNAI |
ISSN (print) | 0302-9743 |
ISSN (electronic) | 1611-3349 |
Abstract
Online learning has been one of the trending areas of machine learning in recent years. How to update the model based on new data is the core question in developing an online classifier. When new data arrives, the classifier should keep its model up-to-date by (1) learning new knowledge, (2) keeping relevant learned knowledge, and (3) forgetting obsolete knowledge. This problem becomes more challenging in imbalanced non-stationary scenarios. Previous approaches save arriving instances, then use up/down sampling techniques to balance the preserved samples and update their models. However, this strategy has two drawbacks: first, it delays model updates, and second, up/down sampling causes information loss for the majority classes and introduces noise for the minority classes. To address these drawbacks, we propose the Hyper-Ellipses-Extra-Margin model (HEEM), which addresses the class imbalance challenge in online learning by reacting to every new instance as it arrives. HEEM keeps an ensemble of hyper-extended-ellipses for the minority class. Misclassified instances of the majority class are then used to shrink the ellipse, and correctly predicted instances of the minority class are used to enlarge the ellipse. Experimental results show that HEEM mitigates the class imbalance problem and outperforms the state-of-the-art methods.
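The shrink/enlarge rule described in the abstract can be illustrated with a small toy sketch. This is not the authors' HEEM implementation: the abstract gives no formulas, so the axis-aligned ellipse shape, the multiplicative `shrink` and `grow` factors, the class names `ToyEllipse` and `ToyEllipseEnsemble`, and the choice to leave missed minority instances unhandled are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence

MINORITY, MAJORITY = 1, 0


@dataclass
class ToyEllipse:
    """Axis-aligned hyper-ellipse (an assumed shape, not taken from the paper)."""
    center: List[float]
    radii: List[float]

    def contains(self, x: Sequence[float]) -> bool:
        # x is inside when the sum of squared normalized offsets is at most 1.
        return sum(((xi - c) / r) ** 2
                   for xi, c, r in zip(x, self.center, self.radii)) <= 1.0

    def scale(self, factor: float) -> None:
        # Multiply every radius by the same factor (assumed update form).
        self.radii = [r * factor for r in self.radii]


class ToyEllipseEnsemble:
    """Ensemble of ellipses covering the minority class (hypothetical sketch)."""

    def __init__(self, ellipses: List[ToyEllipse],
                 shrink: float = 0.9, grow: float = 1.1) -> None:
        self.ellipses = ellipses
        self.shrink = shrink  # assumed factor applied on a majority false alarm
        self.grow = grow      # assumed factor applied on a correct minority hit

    def predict(self, x: Sequence[float]) -> int:
        # Minority if any ellipse contains x, majority otherwise.
        inside = any(e.contains(x) for e in self.ellipses)
        return MINORITY if inside else MAJORITY

    def update(self, x: Sequence[float], y: int) -> int:
        """Predict, then adapt on this single instance; returns the prediction."""
        y_hat = self.predict(x)
        if y_hat == MINORITY:
            # Correct minority prediction: enlarge the ellipses that fired.
            # Majority instance predicted as minority: shrink them.
            factor = self.grow if y == MINORITY else self.shrink
            for e in self.ellipses:
                if e.contains(x):
                    e.scale(factor)
        # Instances predicted as majority change nothing in this toy version.
        return y_hat


if __name__ == "__main__":
    model = ToyEllipseEnsemble([ToyEllipse([0.0, 0.0], [1.0, 1.0])])
    stream = [([0.5, 0.0], MINORITY), ([1.05, 0.0], MAJORITY), ([5.0, 5.0], MINORITY)]
    for x, y in stream:
        print(x, y, model.update(x, y), model.ellipses[0].radii)
```

Because each call to `update` handles exactly one instance, the sketch needs no buffer of stored samples and no resampling step, which is the property the abstract contrasts with earlier approaches.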
ASJC Scopus subject areas
- Mathematics (all)
- Theoretical Computer Science
- Computer Science (all)
- General Computer Science
Cite
Advances in Knowledge Discovery and Data Mining: 25th Pacific-Asia Conference, PAKDD 2021, Virtual Event, May 11–14, 2021, Proceedings, Part I. Ed. / Kamal Karlapalem; Hong Cheng; Naren Ramakrishnan; R. K. Agrawal; P. Krishna Reddy; Jaideep Srivastava; Tanmoy Chakraborty. Springer Science and Business Media Deutschland GmbH, 2021. pp. 603-615 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12712 LNAI).
Publication: Chapter in book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - An Online Learning Algorithm for Non-stationary Imbalanced Data by Extra-Charging Minority Class
AU - Siahroudi, Sajjad Kamali
AU - Kudenko, Daniel
PY - 2021/5/9
Y1 - 2021/5/9
AB - Online learning has been one of the trending areas of machine learning in recent years. How to update the model based on new data is the core question in developing an online classifier. When new data arrives, the classifier should keep its model up-to-date by (1) learning new knowledge, (2) keeping relevant learned knowledge, and (3) forgetting obsolete knowledge. This problem becomes more challenging in imbalanced non-stationary scenarios. Previous approaches save arriving instances, then use up/down sampling techniques to balance the preserved samples and update their models. However, this strategy has two drawbacks: first, it delays model updates, and second, up/down sampling causes information loss for the majority classes and introduces noise for the minority classes. To address these drawbacks, we propose the Hyper-Ellipses-Extra-Margin model (HEEM), which addresses the class imbalance challenge in online learning by reacting to every new instance as it arrives. HEEM keeps an ensemble of hyper-extended-ellipses for the minority class. Misclassified instances of the majority class are then used to shrink the ellipse, and correctly predicted instances of the minority class are used to enlarge the ellipse. Experimental results show that HEEM mitigates the class imbalance problem and outperforms the state-of-the-art methods.
KW - Imbalanced data
KW - Nonstationary data
KW - Online learning
UR - http://www.scopus.com/inward/record.url?scp=85111098740&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-75762-5_48
DO - 10.1007/978-3-030-75762-5_48
M3 - Conference contribution
AN - SCOPUS:85111098740
SN - 9783030757618
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 603
EP - 615
BT - Advances in Knowledge Discovery and Data Mining
A2 - Karlapalem, Kamal
A2 - Cheng, Hong
A2 - Ramakrishnan, Naren
A2 - Agrawal, R. K.
A2 - Reddy, P. Krishna
A2 - Srivastava, Jaideep
A2 - Chakraborty, Tanmoy
PB - Springer Science and Business Media Deutschland GmbH
T2 - 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2021
Y2 - 11 May 2021 through 14 May 2021
ER -