Department of Geography, Pennsylvania State University, PA, USA.
IEEE Trans Vis Comput Graph. 2009 Nov-Dec;15(6):889-96. doi: 10.1109/TVCG.2009.130.
A dendrogram that visualizes a clustering hierarchy is often integrated with a reorderable matrix for pattern identification. The method is widely used in many research fields, including biology, geography, statistics, and data mining. However, most dendrograms do not scale up well, particularly with respect to problems of graphical and cognitive information overload. This research proposes a strategy that links an overview dendrogram and a detail-view dendrogram, each integrated with a reorderable matrix. The overview displays only a user-controlled, limited number of nodes that represent the "skeleton" of a hierarchy. The detail view displays the sub-tree represented by a selected meta-node in the overview. The research presented here focuses on constructing a concise overview dendrogram and coordinating it with a detail view. The proposed method has the following benefits: dramatic alleviation of information overload, enhanced scalability and data abstraction quality on the dendrogram, and support for data exploration at arbitrary levels of detail. The contributions of the paper include a new metric to measure the "importance" of nodes in a dendrogram; a method to construct the concise overview dendrogram from the dynamically identified important nodes; and a measure for evaluating the data abstraction quality of dendrograms. We evaluate the proposed method, compare it to related existing methods, and demonstrate how it can help users find interesting patterns through a case study on county-level U.S. cervical cancer mortality and demographic data.
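The overview/detail idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define the "importance" metric, so the sketch substitutes merge height as a stand-in, and the names `Node`, `overview`, and `max_nodes` are hypothetical. The overview keeps expanding the highest remaining internal node until the user-set budget of meta-nodes is reached; the detail view is simply the sub-tree under a chosen meta-node.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison, so list.index finds the exact node
class Node:
    name: str
    height: float = 0.0                      # merge height; 0 for leaves
    children: list = field(default_factory=list)

    def leaves(self):
        """Detail view: all leaf names under this (meta-)node."""
        if not self.children:
            return [self.name]
        return [leaf for child in self.children for leaf in child.leaves()]


def overview(root, max_nodes):
    """Return at most max_nodes meta-nodes forming a skeleton of the tree.

    ASSUMPTION: merge height stands in for the paper's importance metric,
    which the abstract does not specify.
    """
    frontier = [root]
    while len(frontier) < max_nodes:
        candidates = [n for n in frontier if n.children]
        if not candidates:
            break  # every meta-node is already a leaf
        target = max(candidates, key=lambda n: n.height)
        if len(frontier) - 1 + len(target.children) > max_nodes:
            break  # expanding would exceed the user's budget
        i = frontier.index(target)
        frontier[i:i + 1] = target.children  # replace node by its children
    return frontier


# Toy hierarchy over five leaves (made-up heights, for illustration only).
a, b, c, d, e = (Node(x) for x in "abcde")
ab = Node("ab", 1.0, [a, b])
cd = Node("cd", 2.0, [c, d])
cde = Node("cde", 5.0, [cd, e])
root = Node("root", 9.0, [ab, cde])

skeleton = overview(root, max_nodes=3)
print([n.name for n in skeleton])   # meta-nodes shown in the overview
print(skeleton[0].leaves())         # detail view of the first meta-node
```

Raising `max_nodes` refines the overview toward the full tree, which mirrors the user-controlled node limit described above; any other importance score could replace `n.height` in the `max` call.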