Feature selection using distributions of orthogonal PLS regression vectors in spectral data.

Authors

Lee Geonseok, Lee Kichun

Affiliation

Industrial Engineering, Hanyang University, Seoul, Korea.

Publication

BioData Min. 2021 Jan 22;14(1):7. doi: 10.1186/s13040-021-00240-3.

Abstract

Feature selection, which is important for the successful analysis of chemometric data, aims to produce parsimonious and predictive models. Partial least squares (PLS) regression is one of the main methods in chemometrics for analyzing multivariate data with input X and response Y by modeling the covariance structure in the X and Y spaces. Recently, orthogonal projections to latent structures (OPLS) has been widely used in processing multivariate data because OPLS improves the interpretability of PLS models by removing systematic variation in the X space that is not correlated with Y. The purpose of this paper is to present a feature selection method for multivariate data based on orthogonal PLS regression (OPLSR), which combines orthogonal signal correction with PLS. The presented method generates empirical distributions of the effects of features on Y in OPLSR vectors via permutation tests and examines the significance of the effects of the input features on Y. We show the performance of the proposed method, in comparison with the false discovery rate method, using a simulation study in which a three-layer network structure exists. To demonstrate the method, we apply it to both real-life NIR spectral data and mass spectrometry data.
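The general idea of testing regression-vector entries against a permutation null can be sketched in a few lines of NumPy. This is only an illustration, not the authors' implementation: the function names `opls_regression_vector` and `permutation_pvalues` are hypothetical, the fit is a simplified single-response OPLS-style model (one predictive component after removing `n_ortho` Y-orthogonal components), the data are synthetic, and the paper's exact procedure for building and evaluating the empirical distributions may differ.

```python
import numpy as np

def opls_regression_vector(X, y, n_ortho=1):
    """Regression vector of a simplified single-response OPLS-style fit."""
    X = X - X.mean(axis=0)              # column-center the inputs (new array)
    y = y - y.mean()                    # center the response (new array)
    for _ in range(n_ortho):
        w = X.T @ y
        w /= np.linalg.norm(w)          # predictive weight vector
        t = X @ w                       # predictive scores
        p = X.T @ t / (t @ t)           # predictive loadings
        w_o = p - (w @ p) * w           # part of the loadings orthogonal to w
        norm = np.linalg.norm(w_o)
        if norm < 1e-12:                # nothing Y-orthogonal left to remove
            break
        w_o /= norm
        t_o = X @ w_o                   # Y-orthogonal scores
        p_o = X.T @ t_o / (t_o @ t_o)   # Y-orthogonal loadings
        X = X - np.outer(t_o, p_o)      # filter the orthogonal variation out of X
    w = X.T @ y
    w /= np.linalg.norm(w)
    t = X @ w
    c = (y @ t) / (t @ t)               # regression of y on the predictive scores
    return w * c

def permutation_pvalues(X, y, n_perm=200, n_ortho=1, seed=0):
    """Empirical p-value per feature from a null built by permuting y."""
    rng = np.random.default_rng(seed)
    observed = np.abs(opls_regression_vector(X, y, n_ortho))
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        null = np.abs(opls_regression_vector(X, rng.permutation(y), n_ortho))
        exceed += null >= observed      # count null effects at least as large
    return (exceed + 1) / (n_perm + 1)  # add-one correction avoids p = 0

# Synthetic example (assumed data): only features 0 and 1 influence y.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.normal(size=100)
pvals = permutation_pvalues(X, y)
selected = np.flatnonzero(pvals < 0.05)  # features whose effect looks significant
```

Here each feature is compared only with its own null distribution, and the 0.05 cutoff is an arbitrary choice for the sketch; with many features, a multiple-testing adjustment would normally be considered.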

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78f1/7821640/2d5f6921bec5/13040_2021_240_Fig1_HTML.jpg