IEEE Trans Vis Comput Graph. 2018 Mar;24(3):1301-1315. doi: 10.1109/TVCG.2017.2661309. Epub 2017 Jan 30.
Parallel coordinates plots (PCPs) are a well-studied technique for exploring multi-attribute datasets. In many situations, users find them a flexible way to analyze and interact with data. Unfortunately, PCPs become hard to use as the number of data items grows large or when multiple trends within the data mix in the visualization. The resulting overdraw can obscure important features. A number of modifications to PCPs have been proposed to mitigate this problem, including the use of color, opacity, smooth curves, frequency, density, and animation. However, these modified PCPs tend to have their own limitations in the kinds of relationships they emphasize. We propose a new data-scalable design for representing and exploring data relationships in PCPs. The approach exploits the point/line duality property of PCPs and a local linear assumption about the data to extract and represent relationship summaries. It simultaneously shows relationships in the data and the consistency of those relationships. Our approach supports various visualization tasks on large datasets, including identification of mixed linear and nonlinear patterns, noise detection, and outlier detection. We demonstrate these tasks on multiple synthetic and real-world datasets.
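The point/line duality mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation; it only checks the standard duality for two adjacent axes, assuming the first axis sits at horizontal position 0 and the second at distance `d`. Under that assumption, every data point on the Cartesian line x2 = m*x1 + b (with m != 1) is drawn as a segment passing through the single point (d/(1-m), b/(1-m)). The function names `dual_point` and `segment_height` are hypothetical and chosen for illustration.

```python
def dual_point(m, b, d=1.0):
    """Point in the PCP plane dual to the Cartesian line x2 = m*x1 + b.

    Assumes axis X1 at horizontal position 0 and axis X2 at position d.
    """
    if m == 1:
        # Slope 1 yields parallel segments, so the dual point lies at infinity.
        raise ValueError("slope 1 has no finite dual point")
    return d / (1.0 - m), b / (1.0 - m)


def segment_height(x1, x2, t, d=1.0):
    """Height at horizontal position t of the segment drawn for the data point (x1, x2)."""
    return x1 + (x2 - x1) * t / d


# Illustrative values only: a negatively sloped line, sampled at a few x1 values.
m, b = -2.0, 3.0
px, py = dual_point(m, b)
heights = [segment_height(x1, m * x1 + b, px) for x1 in (-1.0, 0.0, 0.5, 2.0)]

# Every segment reaches the same height at t = px, so they all cross at (px, py).
print(px, py, heights)
```

With a negative slope the crossing point falls between the two axes (0 < px < d), which is the familiar "X" pattern that PCPs show for negatively correlated attributes.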