Nonlinear principal component analysis of noisy data.

Author information

Hsieh William W

Affiliation

Department of Earth and Ocean Sciences, University of British Columbia, Vancouver, BC, Canada.

Publication information

Neural Netw. 2007 May;20(4):434-43. doi: 10.1016/j.neunet.2007.04.018. Epub 2007 May 6.

Abstract

With very noisy data, having plentiful samples eliminates overfitting in nonlinear regression, but not in nonlinear principal component analysis (NLPCA). To overcome this problem in NLPCA, a new information criterion (IC) is proposed for selecting the best model among multiple models with different complexity and regularization (i.e. weight penalty). This IC gauges the inconsistency I between the nonlinear principal components (u and ũ) for every data point x and its nearest neighbour x̃, with I = 1 - correlation(u, ũ), where I tends to increase with overfitted solutions. Tests were performed using autoassociative neural networks for NLPCA on synthetic and real climate data (tropical Pacific sea surface temperatures and equatorial stratospheric winds), with the IC performing well in model selection and in deciding between an open curve or a closed curve solution.
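
The inconsistency measure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, the nearest neighbour is found by brute-force Euclidean distance, and the two component vectors are simple stand-ins rather than outputs of an autoassociative neural network. One stand-in varies smoothly along the data, the other is random and mimics a heavily overfitted solution in which neighbouring points receive unrelated component values.

```python
import numpy as np

def nearest_neighbour_indices(x):
    """Return, for each row of x, the index of its nearest other row (Euclidean)."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)  # squared pairwise distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbour
    return np.argmin(d2, axis=1)

def inconsistency_index(x, u):
    """I = 1 - correlation(u, u_tilde), where u_tilde[i] is the component
    value assigned to the nearest neighbour of x[i]."""
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    u_tilde = u[nearest_neighbour_indices(x)]
    return 1.0 - np.corrcoef(u, u_tilde)[0, 1]

# Synthetic noisy data scattered around a parabola (illustrative only).
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, 500)
x = np.column_stack([t, t ** 2]) + 0.05 * rng.standard_normal((500, 2))

# Assumption: stand-ins for the nonlinear principal component u.
u_smooth = x[:, 0]                  # changes gradually between neighbours
u_noisy = rng.standard_normal(500)  # unrelated to position, like an overfit

I_smooth = inconsistency_index(x, u_smooth)
I_noisy = inconsistency_index(x, u_noisy)
print(f"I (smooth component): {I_smooth:.3f}")
print(f"I (noisy component):  {I_noisy:.3f}")
```

In this toy setting the smooth component gives I close to 0 and the random component gives I close to 1, in line with the abstract's statement that I tends to increase with overfitted solutions. How I is combined with other quantities to rank candidate models is not specified in the abstract and is left out here.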
