Details
Original language | English
---|---
Title of host publication | 2019 IEEE International Conference on Big Data (Big Data)
Editors | Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher | Institute of Electrical and Electronics Engineers Inc.
Pages | 1375-1380
Number of pages | 6
ISBN (electronic) | 9781728108582
ISBN (print) | 9781728108599
Publication status | Published - 2020
Event | 2019 IEEE International Conference on Big Data, Big Data 2019, Los Angeles, United States, 9 Dec 2019 → 12 Dec 2019
Abstract
Automated decision making based on big data and machine learning (ML) algorithms can result in discriminatory decisions against certain protected groups defined upon personal data such as gender, race, or sexual orientation. Algorithms designed to discover patterns in big data might not only pick up societal biases encoded in the training data but, even worse, reinforce those biases, resulting in more severe discrimination. The majority of fairness-aware machine learning approaches proposed thus far focus solely on the pre-, in-, or post-processing steps of the machine learning process, that is, on the input data, the learning algorithms, or the derived models, respectively. However, the fairness problem cannot be isolated to a single step of the ML process; rather, discrimination is often the result of complex interactions between big data and algorithms, and therefore a more holistic approach is required. The proposed FAE (Fairness-Aware Ensemble) framework combines fairness-related interventions at both the pre- and post-processing steps of the data analysis process. In the pre-processing step, we tackle the problems of under-representation of the protected group (group imbalance) and of class imbalance by generating balanced training samples. In the post-processing step, we tackle the problem of class overlap by shifting the decision boundary in the direction of fairness.
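Although this record carries no code, the abstract is concrete enough to sketch the two-step recipe it describes. Below is a minimal Python illustration, assuming scikit-learn and NumPy; the bag construction (`make_balanced_bags`), the ensemble averaging, and the threshold search (`shift_threshold`, which uses a simple demographic-parity criterion as a stand-in for the paper's class-overlap-based boundary shift) are hypothetical simplifications, not a reproduction of the published FAE method.

```python
# Illustrative sketch only; names and criteria are assumptions, not the paper's exact algorithm.
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_balanced_bags(X, y, protected, n_bags=10):
    """Yield training bags balanced over (protected group, class) cells,
    countering both group imbalance and class imbalance."""
    cells = [np.flatnonzero((protected == g) & (y == c))
             for g in np.unique(protected) for c in np.unique(y)]
    size = min(len(c) for c in cells)  # equal draw per cell
    for _ in range(n_bags):
        idx = np.concatenate([rng.choice(c, size, replace=True) for c in cells])
        yield X[idx], y[idx]

def fit_ensemble(X, y, protected, base=LogisticRegression(max_iter=1000)):
    # One base classifier per balanced bag; cloning keeps the bags independent.
    return [clone(base).fit(Xb, yb) for Xb, yb in make_balanced_bags(X, y, protected)]

def ensemble_scores(models, X):
    # Average the per-model positive-class probabilities.
    return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)

def shift_threshold(scores, protected, grid=np.linspace(0.3, 0.7, 81)):
    """Post-processing stand-in: pick a threshold for the protected group that
    matches its positive rate to the unprotected group's rate at 0.5."""
    target = (scores[protected == 0] >= 0.5).mean()
    gaps = [abs((scores[protected == 1] >= t).mean() - target) for t in grid]
    return grid[int(np.argmin(gaps))]

# Toy usage on synthetic data (minority group ~20% of the population).
X = rng.normal(size=(2000, 5))
protected = (rng.random(2000) < 0.2).astype(int)
y = ((X[:, 0] + 0.5 * protected + rng.normal(size=2000)) > 0.8).astype(int)

models = fit_ensemble(X, y, protected)
scores = ensemble_scores(models, X)
t_prot = shift_threshold(scores, protected)
y_hat = np.where(protected == 1, scores >= t_prot, scores >= 0.5)
```

The division of labor mirrors the abstract: the balanced bags address group and class imbalance before training, while the shifted decision boundary addresses what balanced training alone cannot remove after it.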
Keywords
- class imbalance, class overlap, ensemble learning, fairness-aware classification, group imbalance
ASJC Scopus subject areas
- Computer Science(all)
- Artificial Intelligence
- Computer Networks and Communications
- Information Systems
- Decision Sciences(all)
- Information Systems and Management
Cite this
FAE. / Iosifidis, Vasileios; Fetahu, Besnik; Ntoutsi, Eirini. 2019 IEEE International Conference on Big Data (Big Data). ed. / Chaitanya Baru; Jun Huan; Latifur Khan; Xiaohua Tony Hu; Ronay Ak; Yuanyuan Tian; Roger Barga; Carlo Zaniolo; Kisung Lee; Yanfang Fanny Ye. Institute of Electrical and Electronics Engineers Inc., 2020. p. 1375-1380 9006487.
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - FAE
T2 - 2019 IEEE International Conference on Big Data, Big Data 2019
AU - Iosifidis, Vasileios
AU - Fetahu, Besnik
AU - Ntoutsi, Eirini
N1 - Funding information: This work is part of a project that has received funding from the European Union's Horizon 2020 programme, under the Innovative Training Networks (ITN-ETN) Marie Skłodowska-Curie grant agreement no. 860630 (NoBIAS - Artificial Intelligence without Bias). The work is also inspired by the Volkswagen Foundation project BIAS ("Bias and Discrimination in Big Data and Algorithmic Processing. Philosophical Assessments, Legal Dimensions, and Technical Solutions") within the initiative "AI and the Society of the Future"; the last author is a Project Investigator for both of them.
PY - 2020
Y1 - 2020
N2 - Automated decision making based on big data and machine learning (ML) algorithms can result in discriminatory decisions against certain protected groups defined upon personal data such as gender, race, or sexual orientation. Algorithms designed to discover patterns in big data might not only pick up societal biases encoded in the training data but, even worse, reinforce those biases, resulting in more severe discrimination. The majority of fairness-aware machine learning approaches proposed thus far focus solely on the pre-, in-, or post-processing steps of the machine learning process, that is, on the input data, the learning algorithms, or the derived models, respectively. However, the fairness problem cannot be isolated to a single step of the ML process; rather, discrimination is often the result of complex interactions between big data and algorithms, and therefore a more holistic approach is required. The proposed FAE (Fairness-Aware Ensemble) framework combines fairness-related interventions at both the pre- and post-processing steps of the data analysis process. In the pre-processing step, we tackle the problems of under-representation of the protected group (group imbalance) and of class imbalance by generating balanced training samples. In the post-processing step, we tackle the problem of class overlap by shifting the decision boundary in the direction of fairness.
AB - Automated decision making based on big data and machine learning (ML) algorithms can result in discriminatory decisions against certain protected groups defined upon personal data such as gender, race, or sexual orientation. Algorithms designed to discover patterns in big data might not only pick up societal biases encoded in the training data but, even worse, reinforce those biases, resulting in more severe discrimination. The majority of fairness-aware machine learning approaches proposed thus far focus solely on the pre-, in-, or post-processing steps of the machine learning process, that is, on the input data, the learning algorithms, or the derived models, respectively. However, the fairness problem cannot be isolated to a single step of the ML process; rather, discrimination is often the result of complex interactions between big data and algorithms, and therefore a more holistic approach is required. The proposed FAE (Fairness-Aware Ensemble) framework combines fairness-related interventions at both the pre- and post-processing steps of the data analysis process. In the pre-processing step, we tackle the problems of under-representation of the protected group (group imbalance) and of class imbalance by generating balanced training samples. In the post-processing step, we tackle the problem of class overlap by shifting the decision boundary in the direction of fairness.
KW - class imbalance
KW - class overlap
KW - ensemble learning
KW - fairness-aware classification
KW - group imbalance
UR - http://www.scopus.com/inward/record.url?scp=85081292429&partnerID=8YFLogxK
U2 - 10.1109/BigData47090.2019.9006487
DO - 10.1109/BigData47090.2019.9006487
M3 - Conference contribution
AN - SCOPUS:85081292429
SN - 9781728108599
SP - 1375
EP - 1380
BT - 2019 IEEE International Conference on Big Data (Big Data)
A2 - Baru, Chaitanya
A2 - Huan, Jun
A2 - Khan, Latifur
A2 - Hu, Xiaohua Tony
A2 - Ak, Ronay
A2 - Tian, Yuanyuan
A2 - Barga, Roger
A2 - Zaniolo, Carlo
A2 - Lee, Kisung
A2 - Ye, Yanfang Fanny
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 9 December 2019 through 12 December 2019
ER -