University of Utah, USA.
IEEE Trans Vis Comput Graph. 2010 Nov-Dec;16(6):1271-80. doi: 10.1109/TVCG.2010.213.
An important goal of scientific data analysis is to understand the behavior of a system or process based on a sample of the system. In many instances it is possible to observe both input parameters and system outputs, and to characterize the system as a high-dimensional function. Such data sets arise, for instance, in large numerical simulations, as energy landscapes in optimization problems, or in the analysis of image data relating to biological or medical parameters. This paper proposes an approach to analyzing and visualizing such data sets. The proposed method combines topological and geometric techniques to provide interactive visualizations of discretely sampled high-dimensional scalar fields. The method relies on a segmentation of the parameter space using an approximate Morse-Smale complex on the cloud of point samples. For each crystal of the Morse-Smale complex, a regression of the system parameters with respect to the output yields a curve in the parameter space. The result is a simplified geometric representation of the Morse-Smale complex in the high-dimensional input domain. This geometric representation is then embedded in 2D, using dimension reduction, to provide a visualization platform. The geometric properties of the regression curves enable the visualization of additional information about each crystal, such as local and global shape, width, length, and sampling densities. The method is illustrated on several synthetic examples of two-dimensional functions. Two use cases, using data sets from the UCI machine learning repository, demonstrate the utility of the proposed approach on real data. Finally, in collaboration with domain experts, the proposed method is applied to two scientific challenges: the analysis of parameters of climate simulations and their relationship to predicted global energy flux, and the analysis of the concentrations of chemical species in a combustion simulation and their integration with temperature.
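The segmentation and regression steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the abstract: a brute-force k-nearest-neighbor graph stands in for the neighborhood structure, each sample walks to its highest (or lowest) neighbor rather than along a distance-normalized steepest slope, a crystal is taken to be the set of samples sharing the same (minimum, maximum) pair, and the per-crystal curve is a Gaussian kernel regression of the inputs on the output. The names `knn_graph`, `follow`, `crystals`, and `regression_curve` are hypothetical, and the 2D embedding step is omitted.

```python
import math

def knn_graph(points, k):
    """Brute-force k-nearest-neighbor index lists, excluding the point itself."""
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbors.append(order[:k])
    return neighbors

def follow(i, values, neighbors, sign):
    """Walk from sample i to its highest (sign=+1) or lowest (sign=-1)
    neighbor until no neighbor improves; return the extremum's index."""
    while True:
        best = max(neighbors[i], key=lambda j: sign * values[j])
        if sign * values[best] <= sign * values[i]:
            return i
        i = best

def crystals(points, values, k=8):
    """Group samples by the (minimum, maximum) pair their walks end at."""
    neighbors = knn_graph(points, k)
    groups = {}
    for i in range(len(points)):
        lo = follow(i, values, neighbors, -1)
        hi = follow(i, values, neighbors, +1)
        groups.setdefault((lo, hi), []).append(i)
    return groups

def regression_curve(points, values, members, steps=5):
    """Kernel-regress the input coordinates on the output value, giving a
    curve through parameter space from the crystal's minimum to its maximum."""
    fs = [values[i] for i in members]
    f_lo, f_hi = min(fs), max(fs)
    h = (f_hi - f_lo) / steps or 1.0  # assumed bandwidth heuristic
    dim = len(points[0])
    curve = []
    for s in range(steps + 1):
        t = f_lo + (f_hi - f_lo) * s / steps
        w = [math.exp(-0.5 * ((values[i] - t) / h) ** 2) for i in members]
        total = sum(w)
        curve.append(tuple(
            sum(wi * points[i][d] for wi, i in zip(w, members)) / total
            for d in range(dim)))
    return curve

# Synthetic 2D example: a single peak sampled on a 5 x 5 grid.
grid = [(0.5 * x, 0.5 * y) for x in range(-2, 3) for y in range(-2, 3)]
values = [-(x * x + y * y) for x, y in grid]
segments = crystals(grid, values, k=8)
curves = {key: regression_curve(grid, values, members)
          for key, members in segments.items()}
```

In this toy example every sample ascends to the grid's center, so each crystal pairs that single maximum with one of the corner minima, and each regression curve is a short polyline in the input domain running from a corner toward the center. A dimension-reduction method would then be applied to these curves to obtain the 2D layout the abstract describes.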