Cui Zhenxing, Chen Lu, Wang Yunhai, Haehn Daniel, Wang Yong, Pfister Hanspeter
IEEE Trans Vis Comput Graph. 2025 Sep;31(9):5611-5625. doi: 10.1109/TVCG.2024.3463800.
This article presents a systematic study of the generalization of convolutional neural networks (CNNs) and humans on relational reasoning tasks with bar charts. We first revisit previous experiments on graphical perception and update the benchmark performance of CNNs. We then test the generalization performance of CNNs on a classic relational reasoning task, estimating bar length ratios in a bar chart, by progressively perturbing the standard visualizations. We further conduct a user study to compare the performance of CNNs and humans. Our results show that CNNs outperform humans only when the training and test data share the same visual encodings; otherwise, they may perform worse. We also find that CNNs are sensitive to perturbations in various visual encodings, regardless of whether those perturbations affect the target bars, whereas humans are mainly influenced by bar lengths. Our study suggests that robust relational reasoning with visualizations is challenging for CNNs. Improving CNNs' generalization performance may require training them to better recognize task-related visual properties.