Details
| Original language | English |
|---|---|
| Pages (from-to) | 96821-96847 |
| Number of pages | 27 |
| Journal | IEEE ACCESS |
| Volume | 12 |
| Publication status | Published - 12 Jul 2024 |
Abstract
Artificial Intelligence (AI) systems can introduce biases that lead to unreliable outcomes and, in the worst-case scenarios, perpetuate systemic and discriminatory results when deployed in the real world. While significant efforts have been made to create bias detection methods, developing reliable and comprehensive documentation artifacts also makes for valuable resources that address bias and aid in minimizing the harms associated with AI systems. Based on compositional design patterns, this paper introduces a documentation approach using a hybrid AI system to prompt the identification and traceability of bias in datasets and predictive AI models. To demonstrate the effectiveness of our approach, we instantiate our pattern in two implementations of a hybrid AI system. One follows an integrated approach and performs fine-grained tracing and documentation of the AI model. In contrast, the other hybrid system follows a principled approach and enables the documentation and comparison of bias in the input data and the predictions generated by the model. Through a use case based on Fake News detection and an empirical evaluation, we show how biases detected during data ingestion steps (e.g., label, over-representation, activity bias) affect the training and predictions of the classification models. Concretely, we report a stark skewness in the distribution of input variables towards the Fake News label, we uncover how a predictive variable leads to more constraints in the learning process, and highlight open challenges of training models with unbalanced datasets. A video summarizing this work is available online (https://youtu.be/v2GfIQPAy_4?si=BXtWOf97cLiZavyu), and the implementation is publicly available on GitHub (https://github.com/SDM-TIB/DocBiasKG).
Keywords
- Bias, hybrid AI systems, knowledge graphs, tracing
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering
Cite this
In: IEEE ACCESS, Vol. 12, 12.07.2024, p. 96821-96847.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Employing Hybrid AI Systems to Trace and Document Bias in ML Pipelines
AU - Russo, Mayra
AU - Chudasama, Yasharajsinh
AU - Purohit, Disha
AU - Sawischa, Sammy
AU - Vidal, Maria Esther
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2024/7/12
Y1 - 2024/7/12
N2 - Artificial Intelligence (AI) systems can introduce biases that lead to unreliable outcomes and, in the worst-case scenarios, perpetuate systemic and discriminatory results when deployed in the real world. While significant efforts have been made to create bias detection methods, developing reliable and comprehensive documentation artifacts also makes for valuable resources that address bias and aid in minimizing the harms associated with AI systems. Based on compositional design patterns, this paper introduces a documentation approach using a hybrid AI system to prompt the identification and traceability of bias in datasets and predictive AI models. To demonstrate the effectiveness of our approach, we instantiate our pattern in two implementations of a hybrid AI system. One follows an integrated approach and performs fine-grained tracing and documentation of the AI model. In contrast, the other hybrid system follows a principled approach and enables the documentation and comparison of bias in the input data and the predictions generated by the model. Through a use case based on Fake News detection and an empirical evaluation, we show how biases detected during data ingestion steps (e.g., label, over-representation, activity bias) affect the training and predictions of the classification models. Concretely, we report a stark skewness in the distribution of input variables towards the Fake News label, we uncover how a predictive variable leads to more constraints in the learning process, and highlight open challenges of training models with unbalanced datasets. A video summarizing this work is available online (https://youtu.be/v2GfIQPAy_4?si=BXtWOf97cLiZavyu), and the implementation is publicly available on GitHub (https://github.com/SDM-TIB/DocBiasKG).
AB - Artificial Intelligence (AI) systems can introduce biases that lead to unreliable outcomes and, in the worst-case scenarios, perpetuate systemic and discriminatory results when deployed in the real world. While significant efforts have been made to create bias detection methods, developing reliable and comprehensive documentation artifacts also makes for valuable resources that address bias and aid in minimizing the harms associated with AI systems. Based on compositional design patterns, this paper introduces a documentation approach using a hybrid AI system to prompt the identification and traceability of bias in datasets and predictive AI models. To demonstrate the effectiveness of our approach, we instantiate our pattern in two implementations of a hybrid AI system. One follows an integrated approach and performs fine-grained tracing and documentation of the AI model. In contrast, the other hybrid system follows a principled approach and enables the documentation and comparison of bias in the input data and the predictions generated by the model. Through a use case based on Fake News detection and an empirical evaluation, we show how biases detected during data ingestion steps (e.g., label, over-representation, activity bias) affect the training and predictions of the classification models. Concretely, we report a stark skewness in the distribution of input variables towards the Fake News label, we uncover how a predictive variable leads to more constraints in the learning process, and highlight open challenges of training models with unbalanced datasets. A video summarizing this work is available online (https://youtu.be/v2GfIQPAy_4?si=BXtWOf97cLiZavyu), and the implementation is publicly available on GitHub (https://github.com/SDM-TIB/DocBiasKG).
KW - Bias
KW - hybrid AI systems
KW - knowledge graphs
KW - tracing
UR - http://www.scopus.com/inward/record.url?scp=85199265384&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2024.3427388
DO - 10.1109/ACCESS.2024.3427388
M3 - Article
AN - SCOPUS:85199265384
VL - 12
SP - 96821
EP - 96847
JO - IEEE ACCESS
JF - IEEE ACCESS
SN - 2169-3536
ER -