Tian Jinhua, Xie Hailun, Hu Siyuan, Liu Jia
Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China.
Department of Psychology & Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China.
Front Comput Neurosci. 2021 Mar 10;15:620281. doi: 10.3389/fncom.2021.620281. eCollection 2021.
The increasingly popular application of AI risks amplifying social bias, such as classifying non-white faces as animals. Recent research has largely attributed this bias to the training data used. However, the underlying mechanism is poorly understood, and strategies to rectify the bias therefore remain unsettled. Here, we examined a typical deep convolutional neural network (DCNN), VGG-Face, which was trained on a face dataset containing more white faces than black and Asian faces. The transfer learning result showed significantly better performance in identifying white faces, resembling a well-known social bias in humans, the other-race effect (ORE). To test whether the effect resulted from the imbalance of face images, we retrained VGG-Face on a dataset containing more Asian faces and found a reverse ORE: the newly trained VGG-Face identified Asian faces more accurately than white faces. Moreover, when the number of Asian and white faces in the dataset was matched, the DCNN showed no bias. To further examine how imbalanced image input leads to the ORE, we performed a representational similarity analysis on VGG-Face's activations. We found that when the dataset contained more white faces, the representation of white faces was more distinct, indexed by smaller in-group similarity and larger representational Euclidean distance. That is, white faces were scattered more sparsely in VGG-Face's representational face space than the other faces. Importantly, the distinctiveness of faces was positively correlated with identification accuracy, which explains the ORE observed in VGG-Face. In summary, our study reveals the mechanism underlying the ORE in DCNNs and provides a novel approach to studying AI ethics.
In addition, the multidimensional face-representation theory discovered in humans also applied to DCNNs, suggesting that future studies apply more cognitive theories to understand DCNNs' behavior.
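The two distinctiveness indices named in the abstract (in-group similarity and mean representational Euclidean distance) can be sketched over a matrix of DCNN activations. Below is a minimal NumPy illustration, not the authors' actual pipeline: the embeddings `white_emb` and `asian_emb` are random stand-ins for penultimate-layer VGG-Face activations, and the group sizes and dimensionality are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for penultimate-layer face embeddings:
# one row per face identity, one column per activation unit.
white_emb = rng.normal(size=(50, 128))
asian_emb = rng.normal(size=(50, 128))

def in_group_similarity(emb):
    """Mean pairwise Pearson correlation within a group.

    Lower values mean the group's faces are represented more
    distinctly from one another.
    """
    corr = np.corrcoef(emb)                  # faces x faces correlation matrix
    iu = np.triu_indices_from(corr, k=1)     # unique off-diagonal pairs
    return corr[iu].mean()

def mean_pairwise_distance(emb):
    """Mean pairwise Euclidean distance within a group.

    Higher values mean the faces are scattered more sparsely
    in the representational face space.
    """
    diff = emb[:, None, :] - emb[None, :, :]   # all pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # faces x faces distance matrix
    iu = np.triu_indices_from(dist, k=1)
    return dist[iu].mean()

for name, emb in [("white", white_emb), ("asian", asian_emb)]:
    print(name, in_group_similarity(emb), mean_pairwise_distance(emb))
```

On the paper's logic, the over-represented race would show a smaller in-group similarity and a larger mean pairwise distance than the under-represented race, and these indices would correlate positively with identification accuracy.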