Sandia National Laboratories, USA.
IEEE Trans Vis Comput Graph. 2011 Dec;17(12):1822-31. doi: 10.1109/TVCG.2011.199.
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling, among other disciplines. They are also characterized by coherent structure or organized motion, i.e., nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information and hence fail to reveal the mechanistic, causal links between organized fluid motion and mixing and reactive processes. It is therefore of great interest to capture and track flow features and their statistics, together with their correlation with relevant scalar quantities, e.g., temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields (e.g., temperature) and length scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as cumulative distribution functions (CDFs), histograms, or time series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion science; however, it is applicable to many other science domains.
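To make the idea of threshold-defined features carrying statistical moments more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a 1D scalar field, takes a feature to be a connected component of the superlevel set at a given threshold, and re-runs a descending sweep for each threshold instead of storing a merge tree. The names `Moments` and `features_at_threshold` are hypothetical. The moments are kept in a mergeable form (count, mean, sum of squared deviations), so that when two features join at a saddle their statistics can be combined without revisiting the data.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    """Running count, mean, and sum of squared deviations (m2) of a sample."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def add(self, x: float) -> None:
        # Single-pass (Welford) update with one new sample.
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self.m2 += d * (x - self.mean)

    def merge(self, other: "Moments") -> None:
        # Pairwise combination of two partial results; used when features merge.
        if other.n == 0:
            return
        n = self.n + other.n
        d = other.mean - self.mean
        self.m2 += other.m2 + d * d * self.n * other.n / n
        self.mean += d * other.n / n
        self.n = n

    @property
    def variance(self) -> float:
        # Population variance; 0.0 for an empty sample.
        return self.m2 / self.n if self.n else 0.0


def features_at_threshold(field, attribute, threshold):
    """Return {representative index: Moments of `attribute`} for every
    connected component of {i : field[i] >= threshold} on a 1D grid.

    Hypothetical helper for illustration only."""
    parent = {}  # union-find forest over the vertices visited so far
    stats = {}   # per-component moments, keyed by the component's root

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Sweep vertices from the highest field value downwards.
    for i in sorted(range(len(field)), key=lambda k: -field[k]):
        if field[i] < threshold:
            break
        parent[i] = i
        stats[i] = Moments()
        stats[i].add(attribute[i])
        for j in (i - 1, i + 1):  # 1D neighbours
            if j in parent:
                a, b = find(i), find(j)
                if a != b:
                    parent[b] = a
                    stats[a].merge(stats.pop(b))
    return stats


# Made-up example data: `field` defines the features, `temperature` is the
# scalar whose per-feature statistics are collected.
field = [1, 5, 4, 0, 3, 6, 2]
temperature = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
for t in (3, 0):
    feats = features_at_threshold(field, temperature, t)
    print(t, sorted((m.n, m.mean, m.variance) for m in feats.values()))
```

In this sketch, lowering the threshold makes features grow and merge, which is the nesting that a merge tree records once for all thresholds. The approach described above instead stores the attributes on a pre-computed tree, so a new threshold does not require another pass over the simulation data.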