Meng Kun, Eloyan Ani
Department of Biostatistics, Brown University School of Public Health, Providence, RI 02903, USA.
J R Stat Soc Series B Stat Methodol. 2021 Apr;83(2):369-394. doi: 10.1111/rssb.12416. Epub 2021 Mar 24.
We propose a framework of principal manifolds to model high-dimensional data. This framework is based on Sobolev spaces and designed to model data of any intrinsic dimension. It includes principal component analysis and the principal curve algorithm as special cases. We propose a novel method for model complexity selection to avoid overfitting, eliminate the effects of outliers, and improve the computation speed. Additionally, we propose a method for identifying the interiors of circle-like curves and cylinder/ball-like surfaces. The proposed approach is compared with existing methods in simulations and applied to estimate tumor surfaces and interiors in a lung cancer study.
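To illustrate the statement that principal component analysis is a special case, the following is a minimal sketch of the linear case only, in which the fitted manifold is an affine subspace found by a singular value decomposition. It is not the authors' Sobolev-space method; the function names `fit_linear_principal_manifold` and `project`, and the synthetic data, are assumptions made for illustration.

```python
import numpy as np


def fit_linear_principal_manifold(X, d):
    """Fit a d-dimensional affine subspace to the rows of X (plain PCA).

    Returns the data mean and an orthonormal basis (d rows) of the subspace
    that minimizes the total squared projection distance.
    """
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:d]


def project(X, mean, basis):
    """Orthogonally project the rows of X onto the fitted affine subspace."""
    return mean + (X - mean) @ basis.T @ basis


# Synthetic data (hypothetical): noisy points along a line in 3-D space.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=200)
direction = np.array([3.0, 4.0, 0.0]) / 5.0  # unit vector
X = np.outer(t, direction) + 0.01 * rng.normal(size=(200, 3))

mean, basis = fit_linear_principal_manifold(X, d=1)
X_hat = project(X, mean, basis)
mean_sq_residual = np.mean(np.sum((X - X_hat) ** 2, axis=1))
```

Here the intrinsic dimension is 1, so a single principal direction recovers the line up to sign. A principal curve or surface would replace the affine subspace with a curved one, which this sketch does not attempt.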