Lawonn Kai, Meuschke Monique, Eulzer Pepe, Mitterreiter Matthias, Giesen Joachim, Gunther Tobias
IEEE Trans Vis Comput Graph. 2023 Jan;29(1):526-536. doi: 10.1109/TVCG.2022.3209374. Epub 2022 Dec 16.
The Gaussian mixture model (GMM) describes the distribution of random variables from several different populations. GMMs have widespread applications in probability theory, statistics, machine learning for unsupervised cluster analysis and topic modeling, as well as in deep learning pipelines. So far, few efforts have been made to explore the underlying point distribution in combination with the GMMs, in particular when the data becomes high-dimensional and when the GMMs are composed of many Gaussians. We present an analysis tool comprising various GPU-based visualization techniques to explore such complex GMMs. To facilitate the exploration of high-dimensional data, we provide a novel navigation system to analyze the underlying data. Instead of projecting the data to 2D, we utilize interactive 3D views to better support users in understanding the spatial arrangements of the Gaussian distributions. The interactive system is composed of two parts: (1) raycasting-based views that visualize cluster memberships and spatial arrangements and support the discovery of new modes, and (2) overview visualizations that enable the comparison of Gaussians with each other, as well as small multiples of different choices of basis vectors. Users are supported in their exploration with customization tools and smooth camera navigation. Our tool was developed and assessed by five domain experts, and its usefulness was evaluated with 23 participants. To demonstrate its effectiveness, we identify interesting features in several data sets.
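The two quantities the abstract refers to, cluster memberships and a 3D view chosen by a set of basis vectors, can be illustrated with a minimal NumPy sketch. This is not the authors' GPU-based raycasting implementation. The function names (`gaussian_pdf`, `memberships`, `project_to_3d`), the synthetic 5-dimensional data, and the random choice of basis vectors are all assumptions made for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal evaluated at each row of x."""
    d = mean.shape[0]
    diff = x - mean
    # Solve cov * y = diff^T instead of forming the explicit inverse.
    solved = np.linalg.solve(cov, diff.T).T
    mahalanobis = np.einsum("ij,ij->i", diff, solved)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * mahalanobis) / norm

def memberships(x, weights, means, covs):
    """Posterior probability that each point belongs to each Gaussian."""
    dens = np.stack(
        [w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs)],
        axis=1,
    )
    return dens / dens.sum(axis=1, keepdims=True)

def project_to_3d(x, basis):
    """Project rows of x onto the 3D subspace spanned by the columns of basis."""
    q, _ = np.linalg.qr(basis)  # orthonormalize the three chosen vectors
    return x @ q

# Hypothetical example: a two-component GMM in five dimensions.
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([np.zeros(5), np.full(5, 4.0)])
covs = np.array([np.eye(5), np.eye(5)])
x = np.vstack([
    rng.normal(means[0], 1.0, size=(50, 5)),
    rng.normal(means[1], 1.0, size=(50, 5)),
])

resp = memberships(x, weights, means, covs)            # shape (100, 2)
points_3d = project_to_3d(x, rng.normal(size=(5, 3)))  # shape (100, 3)
```

Each row of `resp` sums to one, so it can be read as a soft cluster assignment for that point. Passing a different set of three basis vectors to `project_to_3d` gives a different 3D view of the same high-dimensional data.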