Wang Arran Zeyu, Borland David, Peck Tabitha C, Wang Wenyuan, Gotz David
IEEE Trans Vis Comput Graph. 2025 Jan;31(1):765-775. doi: 10.1109/TVCG.2024.3456381. Epub 2024 Nov 28.
"Correlation does not imply causation" is a famous mantra in statistical and visual analysis. However, consumers of visualizations often draw causal conclusions when only correlations between variables are shown. In this paper, we investigate factors that contribute to causal relationships users perceive in visualizations. We collected a corpus of concept pairs from variables in widely used datasets and created visualizations that depict varying correlative associations using three typical statistical chart types. We conducted two MTurk studies on (1) preconceived notions on causal relations without charts, and (2) perceived causal relations with charts, for each concept pair. Our results indicate that people make assumptions about causal relationships between pairs of concepts even without seeing any visualized data. Moreover, our results suggest that these assumptions constitute causal priors that, in combination with visualized association, impact how data visualizations are interpreted. The results also suggest that causal priors may lead to over- or under-estimation in perceived causal relations in different circumstances, and that those priors can also impact users' confidence in their causal assessments. In addition, our results align with prior work, indicating that chart type may also affect causal inference. Using data from the studies, we develop a model to capture the interaction between causal priors and visualized associations as they combine to impact a user's perceived causal relations. In addition to reporting the study results and analyses, we provide an open dataset of causal priors for 56 specific concept pairs that can serve as a potential benchmark for future studies. We also suggest remaining challenges and heuristic-based guidelines to help designers improve visualization design choices to better support visual causal inference.