A survey on datasets for fairness-aware machine learning

Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Wenbin Zhang, Eirini Ntoutsi – 2022

As decision-making increasingly relies on Machine Learning (ML) and (big) data, fairness in data-driven Artificial Intelligence (AI) systems is receiving growing attention from both research and industry. A wide variety of fairness-aware machine learning solutions have been proposed, involving fairness-related interventions in the data, the learning algorithms, and/or the model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. In this paper, we therefore survey real-world datasets used for fairness-aware machine learning. We focus on tabular data, the most common data representation in fairness-aware machine learning. We begin our analysis by using a Bayesian network to identify relationships between the attributes, particularly with respect to the protected attributes and the class attribute. To gain a deeper understanding of bias in the datasets, we then examine the most notable of these relationships through exploratory analysis.

Title

Authors

Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Wenbin Zhang, Eirini Ntoutsi

Keywords

datasets, fairness-aware machine learning

Date

2022-01-25

Identifier

DOI: 10.1002/widm.1452

Source(s)

Published in

"WIREs Data Mining and Knowledge Discovery", Vol. 12, Issue 3; May/June 2022.

Language

eng

Department of Mathematics and Computer Science

Artificial Intelligence and Machine Learning
