Correll Michael, Gleicher Michael
IEEE Trans Vis Comput Graph. 2014 Dec;20(12):2142-51. doi: 10.1109/TVCG.2014.2346298.
When making an inference or comparison with uncertain, noisy, or incomplete data, measurement error and confidence intervals can be as important for judgment as the actual mean values of different groups. These often misunderstood statistical quantities are frequently represented by bar charts with error bars. This paper investigates drawbacks of this standard encoding and considers a set of alternatives designed to more effectively communicate the implications of mean and error data to a general audience, drawing on lessons from the use of visual statistics in the information visualization community. We present a series of crowd-sourced experiments that confirm that the encoding of mean and error significantly changes how viewers make decisions about uncertain data. Careful consideration of design tradeoffs in the visual presentation of data results in human reasoning that is more consistently aligned with statistical inferences. We suggest the use of gradient plots (which use transparency to encode uncertainty) and violin plots (which use width) as better alternatives for inferential tasks than bar charts with error bars.
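As a rough illustration of the two suggested encodings, the sketch below maps a sample's mean and standard error to an opacity (gradient plot) and to a half-width (violin plot). It is a minimal sketch under stated assumptions, not the authors' implementation. The normal approximation, the `z = 1.96` multiplier, the choice to scale both channels by the normal density of the sampling distribution, and all function names are hypothetical choices made for this example.

```python
import math
from statistics import mean, stdev


def summarize(sample, z=1.96):
    """Return (mean, standard error, CI margin).

    Assumption: a normal approximation with z = 1.96 for a ~95% interval.
    """
    m = mean(sample)
    se = stdev(sample) / math.sqrt(len(sample))
    return m, se, z * se


def density_weight(y, m, se):
    """Normal density of the sampling distribution at y, scaled to 1.0 at the mean."""
    return math.exp(-0.5 * ((y - m) / se) ** 2)


def gradient_alpha(y, m, se):
    """Opacity for a gradient plot at value y (assumed mapping: opacity = scaled density)."""
    return density_weight(y, m, se)


def violin_half_width(y, m, se, max_half_width=0.4):
    """Half-width for a violin plot at value y (assumed mapping: width = scaled density)."""
    return max_half_width * density_weight(y, m, se)


if __name__ == "__main__":
    sample = [4.0, 5.0, 6.0, 5.0, 5.0]  # made-up illustrative data
    m, se, margin = summarize(sample)
    for y in (m, m + margin, m + 2 * margin):
        print(round(y, 3), round(gradient_alpha(y, m, se), 3),
              round(violin_half_width(y, m, se), 3))
```

Both functions encode the same quantity through a different visual channel: values far from the mean fade out in the gradient plot and narrow in the violin plot, rather than ending abruptly at an error-bar cap.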