Gerber Samuel, Rübel Oliver, Bremer Peer-Timo, Pascucci Valerio, Whitaker Ross T
University of Utah.
J Comput Graph Stat. 2013 Jan 1;22(1):193-214. doi: 10.1080/10618600.2012.657132.
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is instead on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum, which are often well approximated by a linear model. The approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in an R package, an analysis and simulations on the stability of the Morse-Smale complex approximation, and additional tables for the climate-simulation study.
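The partition-then-fit idea described above can be illustrated with a minimal Python sketch. It is not the paper's R implementation. It assumes a k-nearest-neighbor graph as the discrete domain and, as a simplification, steps to the highest (or lowest) neighbor rather than following a distance-normalized steepest slope. The function names `knn_indices`, `follow`, `morse_smale_partition`, and `fit_piecewise_linear` are hypothetical and chosen for illustration only.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (brute force)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def follow(y, nbrs, sign):
    """Map each point to the local extremum reached by repeated neighbor steps.

    sign=+1 ascends to a local maximum, sign=-1 descends to a local minimum.
    Assumption: step to the best neighbor by value, not by slope.
    """
    n = len(y)
    nxt = np.arange(n)
    for i in range(n):
        cand = nbrs[i]
        j = cand[np.argmax(sign * y[cand])]
        if sign * y[j] > sign * y[i]:  # move only on strict improvement
            nxt[i] = j
    # Pointer jumping: strict improvement rules out cycles, so this converges.
    while True:
        jumped = nxt[nxt]
        if np.array_equal(jumped, nxt):
            return nxt
        nxt = jumped

def morse_smale_partition(X, y, k=8):
    """Label each point by the (minimum, maximum) pair its paths end in."""
    nbrs = knn_indices(X, k)
    maxima = follow(y, nbrs, +1.0)
    minima = follow(y, nbrs, -1.0)
    pairs = np.stack([minima, maxima], axis=1)
    _, labels = np.unique(pairs, axis=0, return_inverse=True)
    return labels.ravel()

def fit_piecewise_linear(X, y, labels):
    """Ordinary least-squares fit (intercept first) within each partition."""
    models = {}
    for c in np.unique(labels):
        mask = labels == c
        A = np.column_stack([np.ones(mask.sum()), X[mask]])
        coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        models[int(c)] = coef
    return models
```

On a one-dimensional toy function such as `y = -|x|` sampled on a regular grid, this sketch separates the rising and falling branches into two partitions, each of which a linear model fits exactly. Noisy data would generally need a simplification step for spurious extrema, which the sketch omits.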