IEEE Trans Vis Comput Graph. 2017 Aug;23(8):1977-1987. doi: 10.1109/TVCG.2016.2607714. Epub 2016 Sep 9.
The stated goal of visual data exploration is to operate at a rate that matches the pace of human data analysts, but the ever-increasing amount of data has led to a fundamental problem: datasets are often too large to process within interactive time frames. Progressive analytics and visualizations have been proposed as potential solutions to this issue. By processing data incrementally in small chunks, progressive systems provide approximate query answers at interactive speeds and then refine them over time with increasing precision. We study how progressive visualizations affect users in exploratory settings through an experiment in which we capture user behavior and knowledge discovery using interaction logs and think-aloud protocols. Our experiment includes three visualization conditions and different simulated dataset sizes. The visualization conditions are: (1) blocking, where results are displayed only after the entire dataset has been processed; (2) instantaneous, a hypothetical condition where results are shown almost immediately; and (3) progressive, where approximate results are displayed quickly and then refined over time. We analyze the data collected in our experiment and observe that users perform equally well with either instantaneous or progressive visualizations on key metrics, such as insight discovery rates and dataset coverage, while blocking visualizations have detrimental effects.
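To make the contrast between the blocking and progressive conditions concrete, the following is a minimal Python sketch of chunked processing for a single aggregate (a mean). It is an illustration only and not the system used in the study: the function names `progressive_mean` and `blocking_mean`, the `chunk_size` parameter, and the choice of a mean as the query are all assumptions made for this example.

```python
from typing import Iterator, List, Optional, Tuple


def progressive_mean(values: List[float], chunk_size: int) -> Iterator[Tuple[float, float]]:
    """Yield (fraction_processed, running_mean) after each chunk.

    Hypothetical helper: each yielded pair is an approximate answer that
    a progressive visualization could draw right away and later refine.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0.0
    seen = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)
        seen += len(chunk)
        # The estimate uses only the data seen so far; it equals the
        # exact mean once every chunk has been consumed.
        yield seen / len(values), total / seen


def blocking_mean(values: List[float], chunk_size: int) -> Optional[float]:
    """Blocking condition: discard intermediate estimates and return
    only the final result (None if the input is empty)."""
    result = None
    for _, estimate in progressive_mean(values, chunk_size):
        result = estimate
    return result


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    for fraction, estimate in progressive_mean(data, chunk_size=2):
        print(f"{fraction:.0%} processed, estimate = {estimate}")
    print("blocking result:", blocking_mean(data, chunk_size=2))
```

In this sketch the chunks are taken in storage order, so early estimates can be biased if the data are sorted; shuffling or sampling the input before chunking is one common way to reduce that, and is left out here for brevity.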