Automated Feature Engineering for Algorithmic Fairness.

Research output: Contribution to journal › Conference article › Research › peer review

Authors

  • Ricardo Salazar
  • Felix Neutatz
  • Ziawasch Abedjan

External Research Organisations

  • Technische Universität Berlin

Details

Original language: English
Pages (from-to): 1694-1702
Number of pages: 9
Journal: Proceedings of the VLDB Endowment
Volume: 14
Issue number: 9
Early online date: 22 Oct 2020
Publication status: Published - May 2021

Abstract

One of the fundamental problems of machine ethics is to avoid the perpetuation and amplification of discrimination through machine learning applications. In particular, it is desired to exclude the influence of attributes with sensitive information, such as gender or race, and other causally related attributes on the machine learning task. The state-of-the-art bias reduction algorithm Capuchin breaks the causality chain of such attributes by adding and removing tuples. However, this horizontal approach can be considered invasive because it changes the data distribution. A vertical approach would be to prune sensitive features entirely. While this would ensure fairness without tampering with the data, it could also hurt the machine learning accuracy. Therefore, we propose a novel multi-objective feature selection strategy that leverages feature construction to generate more features that lead to both high accuracy and fairness. On three well-known datasets, our system achieves higher accuracy than other fairness-aware approaches while maintaining similar or higher fairness.
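
To make the approach concrete, the following is a minimal, self-contained Python sketch of the general idea described above: greedy forward feature selection scored on a weighted combination of predictive accuracy and a fairness metric. The synthetic data, the use of statistical parity difference as the fairness measure, the logistic-regression model, and the greedy search are all illustrative assumptions made here; this is a sketch of the concept, not the authors' system, which additionally performs feature construction and a multi-objective search.

# Illustrative sketch only: greedy feature selection trading off accuracy
# against statistical parity difference. All modeling choices below are
# assumptions for demonstration, not the method from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def statistical_parity_difference(y_pred, sensitive):
    # |P(y_hat = 1 | s = 0) - P(y_hat = 1 | s = 1)|; 0 means perfect parity.
    return abs(y_pred[sensitive == 0].mean() - y_pred[sensitive == 1].mean())

def evaluate(features, X_tr, X_te, y_tr, y_te, s_te):
    # Accuracy and parity violation of a model restricted to `features`.
    clf = LogisticRegression(max_iter=1000).fit(X_tr[:, features], y_tr)
    y_pred = clf.predict(X_te[:, features])
    return (y_pred == y_te).mean(), statistical_parity_difference(y_pred, s_te)

# Synthetic stand-in data; column 0 plays the role of a sensitive attribute.
X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
sensitive = (X[:, 0] > 0).astype(int)
X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, y, sensitive, test_size=0.3, random_state=0)

alpha = 0.5  # trade-off weight: 1.0 = accuracy only, 0.0 = fairness only
selected, remaining, best_score = [], list(range(X.shape[1])), -np.inf
while remaining:
    # Score every remaining feature on alpha*accuracy - (1-alpha)*violation.
    scored = []
    for f in remaining:
        acc, spd = evaluate(selected + [f], X_tr, X_te, y_tr, y_te, s_te)
        scored.append((alpha * acc - (1 - alpha) * spd, f, acc, spd))
    score, f, acc, spd = max(scored)
    if score <= best_score:  # stop once no candidate improves the objective
        break
    best_score = score
    selected.append(f)
    remaining.remove(f)
    print(f"added feature {f}: accuracy={acc:.3f}, parity_diff={spd:.3f}")

print("selected features:", selected)

Note that a single weight alpha collapses the two objectives into one score; a genuinely multi-objective strategy like the one the abstract describes would instead explore the Pareto front of accuracy/fairness trade-offs, and would construct new candidate features rather than only selecting among the originals.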

Cite this

Standard

Automated Feature Engineering for Algorithmic Fairness. / Salazar, Ricardo; Neutatz, Felix; Abedjan, Ziawasch.
In: Proceedings of the VLDB Endowment, Vol. 14, No. 9, 05.2021, p. 1694-1702.

Harvard

Salazar, R, Neutatz, F & Abedjan, Z 2021, 'Automated Feature Engineering for Algorithmic Fairness.', Proceedings of the VLDB Endowment, vol. 14, no. 9, pp. 1694-1702. https://doi.org/10.14778/3461535.3463474

APA

Salazar, R., Neutatz, F., & Abedjan, Z. (2021). Automated Feature Engineering for Algorithmic Fairness. Proceedings of the VLDB Endowment, 14(9), 1694-1702. https://doi.org/10.14778/3461535.3463474

Vancouver

Salazar R, Neutatz F, Abedjan Z. Automated Feature Engineering for Algorithmic Fairness. Proceedings of the VLDB Endowment. 2021 May;14(9):1694-1702. Epub 2020 Oct 22. doi: 10.14778/3461535.3463474

Author

Salazar, Ricardo ; Neutatz, Felix ; Abedjan, Ziawasch. / Automated Feature Engineering for Algorithmic Fairness. In: Proceedings of the VLDB Endowment. 2021 ; Vol. 14, No. 9. pp. 1694-1702.
BibTeX
@article{a9b716dccad94b2c8443fa75d75db541,
title = "Automated Feature Engineering for Algorithmic Fairness.",
abstract = "One of the fundamental problems of machine ethics is to avoid the perpetuation and amplification of discrimination through machine learning applications. In particular, it is desired to exclude the influence of attributes with sensitive information, such as gender or race, and other causally related attributes on the machine learning task. The state-of-the-art bias reduction algorithm Capuchin breaks the causality chain of such attributes by adding and removing tuples. However, this horizontal approach can be considered invasive because it changes the data distribution. A vertical approach would be to prune sensitive features entirely. While this would ensure fairness without tampering with the data, it could also hurt the machine learning accuracy. Therefore, we propose a novel multi-objective feature selection strategy that leverages feature construction to generate more features that lead to both high accuracy and fairness. On three well-known datasets, our system achieves higher accuracy than other fairness-aware approaches while maintaining similar or higher fairness.",
author = "Ricardo Salazar and Felix Neutatz and Ziawasch Abedjan",
note = "Funding Information: The contribution of Felix Neutatz was funded by the German Ministry for Education and Research as BIFOLD - Berlin Institute for the Foundations of Learning and Data (ref. 01IS18025A and ref. 01IS18037A). We thank Babak Salimi for providing us with their source code.",
year = "2021",
month = may,
doi = "10.14778/3461535.3463474",
language = "English",
journal = "Proceedings of the VLDB Endowment",
volume = "14",
pages = "1694--1702",
number = "9",
}

RIS

TY - JOUR

T1 - Automated Feature Engineering for Algorithmic Fairness.

AU - Salazar, Ricardo

AU - Neutatz, Felix

AU - Abedjan, Ziawasch

N1 - Funding Information: The contribution of Felix Neutatz was funded by the German Ministry for Education and Research as BIFOLD - Berlin Institute for the Foundations of Learning and Data (ref. 01IS18025A and ref. 01IS18037A). We thank Babak Salimi for providing us with their source code.

PY - 2021/5

Y1 - 2021/5

N2 - One of the fundamental problems of machine ethics is to avoid the perpetuation and amplification of discrimination through machine learning applications. In particular, it is desired to exclude the influence of attributes with sensitive information, such as gender or race, and other causally related attributes on the machine learning task. The state-of-the-art bias reduction algorithm Capuchin breaks the causality chain of such attributes by adding and removing tuples. However, this horizontal approach can be considered invasive because it changes the data distribution. A vertical approach would be to prune sensitive features entirely. While this would ensure fairness without tampering with the data, it could also hurt the machine learning accuracy. Therefore, we propose a novel multi-objective feature selection strategy that leverages feature construction to generate more features that lead to both high accuracy and fairness. On three well-known datasets, our system achieves higher accuracy than other fairness-aware approaches while maintaining similar or higher fairness.

AB - One of the fundamental problems of machine ethics is to avoid the perpetuation and amplification of discrimination through machine learning applications. In particular, it is desired to exclude the influence of attributes with sensitive information, such as gender or race, and other causally related attributes on the machine learning task. The state-of-the-art bias reduction algorithm Capuchin breaks the causality chain of such attributes by adding and removing tuples. However, this horizontal approach can be considered invasive because it changes the data distribution. A vertical approach would be to prune sensitive features entirely. While this would ensure fairness without tampering with the data, it could also hurt the machine learning accuracy. Therefore, we propose a novel multi-objective feature selection strategy that leverages feature construction to generate more features that lead to both high accuracy and fairness. On three well-known datasets, our system achieves higher accuracy than other fairness-aware approaches while maintaining similar or higher fairness.

UR - http://www.scopus.com/inward/record.url?scp=85115160562&partnerID=8YFLogxK

U2 - 10.14778/3461535.3463474

DO - 10.14778/3461535.3463474

M3 - Conference article

VL - 14

SP - 1694

EP - 1702

JO - Proceedings of the VLDB Endowment

JF - Proceedings of the VLDB Endowment

IS - 9

ER -