An ensemble variable selection method for vibrational spectroscopic data analysis.

Author information

Zhang Jixiong, Yan Hong, Xiong Yanmei, Li Qianqian, Min Shungeng

Affiliations

College of Science, China Agricultural University, No. 2 Yuanmingyuanxi Road, Haidian District, Beijing 100193, P.R. China

School of Marine Science, China University of Geosciences in Beijing, Beijing 100086, China

Publication information

RSC Adv. 2019 Feb 26;9(12):6708-6716. doi: 10.1039/c8ra08754g. eCollection 2019 Feb 22.

Abstract

Wavelength selection is a critical factor for pattern recognition of vibrational spectroscopic data. Not only does it alleviate the effect of dimensionality on an algorithm's generalization performance, but it also enhances the understanding and interpretability of multivariate classification models. In this study, a novel partial least squares discriminant analysis (PLSDA)-based wavelength selection algorithm, termed ensemble of bootstrapping space shrinkage (EBSS), has been devised for vibrational spectroscopic data analysis. In the algorithm, a set of subsets is generated from a data set by random sampling. For each subset, a feature space is determined by maximizing the expected 10-fold cross-validation accuracy with a weighted bootstrap sampling strategy. An ensemble strategy and a sequential forward selection method are then applied to the feature spaces to select characteristic variables. Experimental results obtained from the analysis of real vibrational spectroscopic data sets demonstrate that the ensemble wavelength selection algorithm can retain stable and informative variables for the final modeling and improve the predictive ability of multivariate classification models.
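
The three stages named in the abstract (random subsets, weighted bootstrap shrinkage of the feature space, then an ensemble vote followed by forward selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the re-weighting rule, the default parameters, and the greedy forward pass are all assumptions made for illustration. The `score` callable stands in for the 10-fold cross-validation accuracy of a PLSDA model, and `toy_score` is a hypothetical placeholder so the sketch runs without spectral data.

```python
import random
from collections import Counter


def shrink_feature_space(n_vars, samples, score, n_iter=20, n_draws=40, rng=None):
    """Weighted bootstrap search for a high-scoring variable subset (simplified)."""
    rng = rng or random.Random(0)
    all_vars = list(range(n_vars))
    weights = [1.0] * n_vars  # assumption: start from uniform sampling weights
    best_vars, best_score = all_vars, score(all_vars, samples)
    for _ in range(n_iter):
        trials = []
        for _ in range(n_draws):
            # Draw variables with replacement, biased by the current weights,
            # and keep the unique ones as a candidate feature space.
            drawn = sorted(set(rng.choices(all_vars, weights=weights, k=n_vars)))
            trials.append((score(drawn, samples), drawn))
        trials.sort(key=lambda t: t[0], reverse=True)
        top = trials[: max(1, n_draws // 10)]
        if top[0][0] > best_score:
            best_score, best_vars = top[0]
        # Assumed rule: re-weight each variable by its frequency in the top candidates.
        counts = Counter(v for _, subset in top for v in subset)
        weights = [counts[v] / len(top) + 1e-3 for v in all_vars]
    return best_vars


def ebss_sketch(n_samples, n_vars, score, n_subsets=10, subset_frac=0.8, seed=0):
    """Ensemble of shrunken feature spaces, then a greedy forward pass."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_subsets):
        # Random subset of samples, drawn without replacement.
        samples = rng.sample(range(n_samples), int(subset_frac * n_samples))
        votes.update(shrink_feature_space(n_vars, samples, score, rng=rng))
    # Ensemble step: rank variables by how many feature spaces contain them.
    ranked = sorted(votes, key=lambda v: (-votes[v], v))
    # Forward pass over the ranking: keep a variable only if the score improves.
    all_samples = list(range(n_samples))
    selected, best = [], float("-inf")
    for v in ranked:
        trial_score = score(selected + [v], all_samples)
        if trial_score > best:
            selected.append(v)
            best = trial_score
    return selected


INFORMATIVE = {2, 5, 7}  # hypothetical "true" wavelengths for the toy example


def toy_score(variables, samples):
    # Placeholder for cross-validated PLSDA accuracy: rewards informative
    # variables, penalizes subset size, and ignores the sample indices.
    return len(INFORMATIVE & set(variables)) - 0.05 * len(variables)


selected = ebss_sketch(n_samples=50, n_vars=20, score=toy_score)
print(sorted(selected))
```

To apply the sketch to real spectra, `toy_score` would be replaced by a function that fits a PLSDA model on the given samples and variables and returns its cross-validated accuracy. Passing the scorer in as a callable keeps the search logic independent of any particular modeling library.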

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0ce/9087301/8ec8f539e982/c8ra08754g-f1.jpg
