Rave Hennes, Molchanov Vladimir, Linsen Lars
IEEE Trans Vis Comput Graph. 2025 Apr;31(4):2114-2126. doi: 10.1109/TVCG.2024.3381453. Epub 2025 Feb 27.
Scatterplots provide a visual representation of bivariate data (or 2D embeddings of multivariate data) that allows for effective analyses of data dependencies, clusters, trends, and outliers. Unfortunately, classical scatterplots suffer from scalability issues, since growing data sizes eventually lead to overplotting and visual clutter on a screen with a fixed resolution, which hinders the data analysis process. We propose an algorithm that compensates for irregular sample distributions by a smooth transformation of the scatterplot's visual domain. Our algorithm evaluates the scatterplot's density distribution to compute a regularization mapping based on integral images of the rasterized density function. The mapping preserves the samples' neighborhood relations. Few regularization iterations suffice to achieve a nearly uniform sample distribution that efficiently uses the available screen space. We further propose approaches to visually convey the transformation that was applied to the scatterplot and compare them in a user study. We present a novel parallel algorithm for fast GPU-based integral-image computation, which allows for integrating our de-cluttering approach into interactive visual data analysis systems.
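The abstract names the two ingredients, a rasterized density function and its integral image, without giving the mapping itself. The following is a minimal sketch of those ingredients under stated assumptions, not the authors' algorithm: the function names (`rasterize_density`, `integral_image`, `box_sum`, `equalize_axes`) are hypothetical, samples are assumed to lie in the unit square, and the final step swaps in a simple per-axis cumulative-density mapping as a stand-in for the paper's regularization mapping. The sequential loops also stand in for the GPU-parallel integral-image computation, which is not reproduced here.

```python
def rasterize_density(points, width, height):
    """Count samples per raster cell; points are (x, y) pairs in the unit square."""
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        col = min(int(x * width), width - 1)
        row = min(int(y * height), height - 1)
        grid[row][col] += 1
    return grid


def integral_image(grid):
    """Summed-area table with a zero border: sat[r][c] is the sum of all
    grid cells in rows < r and columns < c."""
    height, width = len(grid), len(grid[0])
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += grid[r][c]
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat


def box_sum(sat, r0, c0, r1, c1):
    """Density mass in rows r0..r1-1 and columns c0..c1-1, in constant time."""
    return sat[r1][c1] - sat[r0][c1] - sat[r1][c0] + sat[r0][c0]


def equalize_axes(points, sat):
    """Assumed simplification: move each sample to the fraction of total mass
    lying to its left (x) and below it (y), interpolating linearly inside a
    cell. Each axis map is monotone, so per-axis ordering is preserved."""
    height, width = len(sat) - 1, len(sat[0]) - 1
    total = sat[height][width]
    if total == 0:
        return list(points)

    def cdf_x(x):
        pos = min(max(x, 0.0), 1.0) * width
        c = min(int(pos), width - 1)
        left, right = sat[height][c], sat[height][c + 1]
        return (left + (pos - c) * (right - left)) / total

    def cdf_y(y):
        pos = min(max(y, 0.0), 1.0) * height
        r = min(int(pos), height - 1)
        below, above = sat[r][width], sat[r + 1][width]
        return (below + (pos - r) * (above - below)) / total

    return [(cdf_x(x), cdf_y(y)) for x, y in points]
```

Calling `equalize_axes(points, integral_image(rasterize_density(points, w, h)))` spreads dense clusters over more of the unit square while keeping the left-to-right and bottom-to-top order of the samples. A separable mapping like this ignores 2D structure that a full integral-image-based mapping can account for, so it should be read only as an illustration of the idea.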