Adeli Ehsan, Zhao Qingyu, Pfefferbaum Adolf, Sullivan Edith V, Fei-Fei Li, Niebles Juan Carlos, Pohl Kilian M
Department of Psychiatry and Behavioral Sciences, Stanford University, CA 94305.
Department of Computer Science, Stanford University, CA 94305.
IEEE Winter Conf Appl Comput Vis. 2021 Jan;2021:2512-2522. doi: 10.1109/wacv48630.2021.00256. Epub 2021 Jun 14.
Presence of bias (in datasets or tasks) is inarguably one of the most critical challenges in machine learning applications and has given rise to pivotal debates in recent years. Such challenges range from spurious associations between variables in medical studies to racial bias in gender or face recognition systems. Controlling for all types of bias at the dataset curation stage is cumbersome and sometimes impossible. The alternative is to use the available data and build models that incorporate fair representation learning. In this paper, we propose such a model based on adversarial training with two competing objectives: to learn features that have (1) maximum discriminative power with respect to the task and (2) minimal statistical mean dependence on the protected (bias) variable(s). Our approach does so by incorporating a new adversarial loss function that encourages a vanishing correlation between the bias and the learned features. We apply our method to synthetic data, to medical images (containing task bias), and to a dataset for gender classification (containing dataset bias). Our results show that the features learned by our method not only yield superior prediction performance but are also unbiased.
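To make the core idea concrete, below is a minimal PyTorch sketch (not from the paper) of a correlation-vanishing penalty: the squared Pearson correlation between each learned feature dimension and the protected variable is added to the task loss, so gradient descent pushes the linear correlation toward zero. All module and variable names are illustrative, and the sketch deliberately collapses the paper's two-player adversarial setup (which trains a separate bias-predictor network against the encoder) into a single direct penalty for brevity.

```python
import torch
import torch.nn as nn

def squared_pearson_corr(f, b, eps=1e-8):
    """Mean squared Pearson correlation between each feature dim and b.

    f: (N, D) learned features; b: (N,) protected (bias) variable.
    Returns a scalar in [0, 1]; 0 means no linear correlation.
    """
    f = f - f.mean(dim=0, keepdim=True)
    b = b - b.mean()
    cov = (f * b.unsqueeze(1)).mean(dim=0)                    # (D,) per-dim covariance
    denom = f.std(dim=0, unbiased=False) * b.std(unbiased=False) + eps
    corr = cov / denom                                        # (D,) per-dim Pearson r
    return (corr ** 2).mean()

# Toy encoder and task head standing in for the model (illustrative only).
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
task_head = nn.Linear(8, 2)
opt = torch.optim.Adam(list(encoder.parameters()) + list(task_head.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()
lam = 1.0  # trade-off between task accuracy and decorrelation

x = torch.randn(64, 16)          # inputs
y = torch.randint(0, 2, (64,))   # task labels
b = torch.randn(64)              # protected variable, e.g. a continuous confounder

features = encoder(x)
loss = ce(task_head(features), y) + lam * squared_pearson_corr(features, b)
opt.zero_grad()
loss.backward()
opt.step()
```

The weight `lam` governs the competition between the two objectives: larger values enforce stronger statistical mean independence from the protected variable at a possible cost to task accuracy.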