Tao Wenbo, Hou Xinli, Sah Adam, Battle Leilani, Chang Remco, Stonebraker Michael
IEEE Trans Vis Comput Graph. 2021 Feb;27(2):401-411. doi: 10.1109/TVCG.2020.3030372. Epub 2021 Jan 28.
Static scatterplots often suffer from the overdraw problem on big datasets, where object overlap causes undesirable visual clutter. The use of zooming in scatterplots can help alleviate this problem. With multiple zoom levels, more screen real estate is available, allowing objects to be placed in a less crowded way. We call this type of visualization a scalable scatterplot visualization, or SSV for short. Despite the potential of SSVs, existing systems and toolkits fall short in supporting the authoring of SSVs due to three limitations. First, many systems have limited scalability, assuming that data fits in the memory of one computer. Second, they require too much developer work, e.g., custom code to generate mark layouts or render objects. Third, many systems focus on only a small subset of the SSV design space (e.g., supporting a specific type of visual mark). To address these limitations, we have developed Kyrix-S, a system for easy authoring of SSVs at scale. Based on an existing survey of scatterplot tasks and designs, Kyrix-S derives a declarative grammar that enables specification of a variety of SSVs in a few tens of lines of code. The declarative grammar is supported by a distributed layout algorithm which automatically places visual marks onto zoom levels. We store data in a multi-node database and use multi-node spatial indexes to achieve interactive browsing of large SSVs. Extensive experiments show that 1) Kyrix-S enables interactive browsing of SSVs of billions of objects, with response times under 500 ms, and 2) Kyrix-S achieves a 4X-9X reduction in specification compared to a state-of-the-art authoring system.
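To make the idea of placing marks onto zoom levels concrete, the following is a minimal single-machine sketch. It is not the distributed layout algorithm of Kyrix-S, whose details the abstract does not give. It assumes a simple greedy rule: each zoom level doubles the canvas scale, and at most one mark may occupy a grid cell at any level. The function name `assign_zoom_levels`, the `importance` field, and the `cell_size` parameter are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input record: (x, y, importance). Higher importance is placed first.
Point = Tuple[float, float, float]


def assign_zoom_levels(points: List[Point], num_levels: int,
                       cell_size: float = 1.0) -> Dict[int, int]:
    """Greedy sketch: map each point index to the coarsest zoom level at which
    it becomes visible. Level k scales coordinates by 2**k, and at most one
    mark is allowed per grid cell at each level. Points that never find a free
    cell within num_levels are left out of the result."""
    # Visit points from most to least important.
    order = sorted(range(len(points)), key=lambda i: -points[i][2])
    level_of: Dict[int, int] = {}

    for level in range(num_levels):
        scale = 2 ** level

        def cell(i: int) -> Tuple[int, int]:
            x, y, _ = points[i]
            return (int(x * scale // cell_size), int(y * scale // cell_size))

        # Marks shown at coarser levels stay visible, so they keep their cells.
        occupied = {cell(i) for i in level_of}

        # Place the remaining points wherever a cell is still free.
        for i in order:
            if i in level_of:
                continue
            c = cell(i)
            if c not in occupied:
                occupied.add(c)
                level_of[i] = level

    return level_of


if __name__ == "__main__":
    demo = [(0.1, 0.1, 3.0), (0.2, 0.2, 2.0), (0.9, 0.9, 1.0), (5.0, 5.0, 0.5)]
    print(assign_zoom_levels(demo, num_levels=4))
```

In this sketch the three points crowded near the origin are spread across successive levels as zooming in separates them, while the isolated point appears at the coarsest level. A grid cell is only a rough proxy for overlap, and a system handling billions of objects would need to partition this work across nodes, which the sketch does not attempt.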