Wang Y, Luo L, Freedman M T, Kung S Y
Department of Electrical Engineering and Computer Science, The Catholic University of America, Washington, DC 20064, USA.
IEEE Trans Neural Netw. 2000;11(3):625-36. doi: 10.1109/72.846734.
Visual exploration has proven to be a powerful tool for multivariate data mining and knowledge discovery. Most visualization algorithms aim to find a projection from the data space down to a visually perceivable rendering space. To reveal the interesting aspects of multimodal data sets residing in a high-dimensional space, a hierarchical visualization algorithm is introduced that allows the complete data set to be visualized at the top level, with clusters and subclusters of data points visualized at deeper levels. The method relies on the hierarchical use of standard finite normal mixtures and probabilistic principal component projections, whose parameters are estimated using expectation-maximization and principal component neural networks under information-theoretic criteria. We demonstrate the principle of the approach on several multimodal numerical data sets, and we then apply the method to visual explanation in computer-aided diagnosis for breast cancer detection from digital mammograms.
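The two-level idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed number of mixture components instead of selecting it by an information-theoretic criterion, it replaces the principal component neural network with a direct eigendecomposition, and it uses plain responsibility-weighted PCA as a stand-in for the probabilistic principal component projection. The function names (`pca_project`, `log_gaussian`, `fit_gmm_em`) and the synthetic data are hypothetical.

```python
import numpy as np

def pca_project(X, weights=None, k=2):
    """Project X onto the top-k principal axes of its (weighted) covariance."""
    w = np.ones(len(X)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    mean = w @ X                                   # weighted mean
    centered = X - mean
    cov = (centered * w[:, None]).T @ centered     # weighted covariance
    _, eigvecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
    axes = eigvecs[:, ::-1][:, :k]                 # top-k directions
    return centered @ axes

def log_gaussian(X, mean, cov):
    """Log density of a multivariate normal, evaluated at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    sol = np.linalg.solve(L, (X - mean).T)         # shape (d, n)
    maha = np.sum(sol ** 2, axis=0)                # squared Mahalanobis distance
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def fit_gmm_em(X, n_components, n_iter=100, seed=0, reg=1e-6):
    """Fit a finite normal mixture by EM; the component count is fixed (assumption)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, n_components, replace=False)]
    covs = np.stack([np.cov(X.T) + reg * np.eye(d)] * n_components)
    pis = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        log_r = np.stack(
            [np.log(pis[j]) + log_gaussian(X, means[j], covs[j])
             for j in range(n_components)], axis=1)
        log_r -= log_r.max(axis=1, keepdims=True)  # stabilize before exponentiating
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing proportions, means, and covariances
        nk = resp.sum(axis=0) + 1e-12
        pis = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(n_components):
            diff = X - means[j]
            covs[j] = (diff * resp[:, j:j + 1]).T @ diff / nk[j] + reg * np.eye(d)
    return pis, means, covs, resp

# Synthetic multimodal data: three Gaussian clusters in five dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0, 0], [6, 6, 0, 0, 0], [0, 6, 6, 0, 0]], dtype=float)
X = np.vstack([c + rng.normal(size=(100, 5)) for c in centers])

# Level 1: a single 2-D projection of the complete data set.
top_view = pca_project(X)

# Level 2: one 2-D projection per mixture component, weighted by responsibilities.
pis, means, covs, resp = fit_gmm_em(X, n_components=3)
sub_views = [pca_project(X, weights=resp[:, j]) for j in range(3)]
```

Each entry of `sub_views` places every point in the local coordinate system of one cluster; when plotting, the points could be shaded by the corresponding column of `resp` so that only the members of that cluster stand out. Deeper levels would repeat the same two steps on the points of a single cluster.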