Details
Original language | English |
---|---|
Pages (from-to) | 65092-65102 |
Number of pages | 11 |
Journal | IEEE ACCESS |
Volume | 10 |
Publication status | Published - 20 Jun 2022 |
Abstract
In recent years, the data available from IoT devices have increased rapidly. Using a machine learning solution to detect faults in these devices requires the release of device data to a central server. However, these data typically contain sensitive information, leading to the need for privacy-preserving distributed machine learning solutions, such as federated learning, where a model is trained locally on the edge device, and only the trained model weights are shared with a central server. Device failure data are typically imbalanced, i.e., the number of failures is minimal compared to the number of normal samples. Therefore, re-balancing techniques are needed to improve the performance of a machine learning model. In this paper, we present FLY-SMOTE, a new approach to re-balance the data in different non-IID scenarios by generating synthetic data for the minority class in supervised learning tasks using a modified SMOTE method. Our approach takes k samples from the minority class and generates Y new synthetic samples based on one of the nearest neighbors of each of the k samples. An experimental campaign on a real IoT dataset and three well-known public datasets shows that the proposed solution improves the balanced accuracy without compromising the model's accuracy.
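The oversampling step described in the abstract can be illustrated with a minimal sketch. This is a generic SMOTE-style interpolation, not the authors' FLY-SMOTE implementation: the function name `smote_like_oversample`, the `n_neighbors` and `seed` parameters, the Euclidean distance, and the choice to pick a fresh neighbor for every synthetic point are all assumptions made for illustration. Only the roles of k (seed samples drawn from the minority class) and Y (synthetic samples generated per seed) come from the abstract.

```python
import math
import random


def smote_like_oversample(minority, k, y, n_neighbors=3, seed=0):
    """Generic SMOTE-style interpolation (illustrative, not the paper's exact algorithm).

    minority    : list of feature vectors (lists of floats) from the minority class
    k           : number of minority samples drawn as seeds
    y           : number of synthetic samples generated per seed
    n_neighbors : size of the neighbor pool per seed (assumed parameter)
    seed        : RNG seed for reproducibility (assumed parameter)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    seeds = rng.sample(minority, min(k, len(minority)))
    synthetic = []
    for s in seeds:
        # Rank the other minority samples by Euclidean distance to the seed.
        others = [m for m in minority if m is not s]
        others.sort(key=lambda m: math.dist(s, m))
        pool = others[:n_neighbors]
        for _ in range(y):
            neighbor = rng.choice(pool)
            gap = rng.random()  # interpolation factor in [0, 1)
            # The new point lies on the segment between the seed and its neighbor.
            synthetic.append([a + gap * (b - a) for a, b in zip(s, neighbor)])
    return synthetic


# Hypothetical toy data: four minority-class points in two dimensions.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like_oversample(minority, k=2, y=3)
print(len(new_points))  # 2 seeds x 3 samples each = 6
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. In a federated setting, a step like this would run locally on each device before training, so no raw data would leave the device.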
Keywords
- Federated learning, imbalanced data, IoT, non-IID data
ASJC Scopus subject areas
- Computer Science(all)
- General Computer Science
- Materials Science(all)
- General Materials Science
- Engineering(all)
- General Engineering
- Engineering(all)
- Electrical and Electronic Engineering
Cite this
In: IEEE ACCESS, Vol. 10, 20.06.2022, p. 65092-65102.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - FLY-SMOTE
T2 - Re-Balancing the Non-IID IoT Edge Devices Data in Federated Learning System
AU - Younis, Raneen
AU - Fisichella, Marco
N1 - Funding Information: The work was partially funded by the European Commission for the eXplainable Artificial Intelligence in healthcare Management (xAIM) project, agreement No INEA/CEF/ICT/A2020/2276680.
PY - 2022/6/20
Y1 - 2022/6/20
AB - In recent years, the data available from IoT devices have increased rapidly. Using a machine learning solution to detect faults in these devices requires the release of device data to a central server. However, these data typically contain sensitive information, leading to the need for privacy-preserving distributed machine learning solutions, such as federated learning, where a model is trained locally on the edge device, and only the trained model weights are shared with a central server. Device failure data are typically imbalanced, i.e., the number of failures is minimal compared to the number of normal samples. Therefore, re-balancing techniques are needed to improve the performance of a machine learning model. In this paper, we present FLY-SMOTE, a new approach to re-balance the data in different non-IID scenarios by generating synthetic data for the minority class in supervised learning tasks using a modified SMOTE method. Our approach takes k samples from the minority class and generates Y new synthetic samples based on one of the nearest neighbors of each of the k samples. An experimental campaign on a real IoT dataset and three well-known public datasets shows that the proposed solution improves the balanced accuracy without compromising the model's accuracy.
KW - Federated learning
KW - imbalanced data
KW - IoT
KW - non-IID data
UR - http://www.scopus.com/inward/record.url?scp=85133601434&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2022.3184309
DO - 10.1109/ACCESS.2022.3184309
M3 - Article
AN - SCOPUS:85133601434
VL - 10
SP - 65092
EP - 65102
JO - IEEE ACCESS
JF - IEEE ACCESS
SN - 2169-3536
ER -