TY - GEN
T1 - On Detecting Biased Predictions with Post-hoc Explanation Methods
AU - Ruggeri, Matteo
AU - Dethise, Alice
AU - Canini, Marco
N1 - Publisher Copyright:
© 2023 Association for Computing Machinery.
PY - 2023/12/8
Y1 - 2023/12/8
N2 - We develop a methodology for the analysis of machine learning (ML) models to detect and understand biased decisions and apply it to two specific scenarios. In particular, we show how analyzing model predictions across the dataset, comparing models trained on different subsets of the original data, and applying model-agnostic post-hoc explanation tools can help identify bias in a model in general as well as in specific instances. Further, we consider several definitions of bias and fairness, and show how each provides a different interpretation of the model decisions. Our results show that the analysis of models through the lens of statistical analysis and post-hoc explanations helps to detect and understand bias. We also observe that post-hoc explanations often fail to detect individual biased instances, and caution against using this category of tools to guarantee model fairness. Finally, we provide insights on how this analysis can help understand the origin and shape of bias.
KW - Explainable Machine Learning
KW - Feature Analysis
KW - Post-hoc Explanations
UR - http://www.scopus.com/inward/record.url?scp=85180533381&partnerID=8YFLogxK
U2 - 10.1145/3630050.3630179
DO - 10.1145/3630050.3630179
M3 - Conference contribution
AN - SCOPUS:85180533381
T3 - SAFE 2023 - Proceedings of the 2023 Explainable and Safety Bounded, Fidelitous, Machine Learning for Networking
SP - 17
EP - 23
BT - SAFE 2023 - Proceedings of the 2023 Explainable and Safety Bounded, Fidelitous, Machine Learning for Networking
PB - Association for Computing Machinery, Inc
T2 - 2023 Explainable and Safety Bounded, Fidelitous, Machine Learning for Networking, SAFE 2023
Y2 - 8 December 2023
ER -