

Feature selection using distributions of orthogonal PLS regression vectors in spectral data.

Authors

Lee Geonseok, Lee Kichun

Affiliation

Industrial Engineering, Hanyang University, Seoul, Korea.

Publication

BioData Min. 2021 Jan 22;14(1):7. doi: 10.1186/s13040-021-00240-3.

DOI:10.1186/s13040-021-00240-3
PMID:33482872
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7821640/
Abstract

Feature selection, which is important for successful analysis of chemometric data, aims to produce parsimonious and predictive models. Partial least squares (PLS) regression is one of the main methods in chemometrics for analyzing multivariate data with input X and response Y by modeling the covariance structure in the X and Y spaces. Recently, orthogonal projections to latent structures (OPLS) has been widely used in processing multivariate data because OPLS improves the interpretability of PLS models by removing systematic variation in the X space that is not correlated with Y. The purpose of this paper is to present a feature selection method for multivariate data based on orthogonal PLS regression (OPLSR), which combines orthogonal signal correction with PLS. The presented method generates empirical distributions of the features' effects on Y in OPLSR vectors via permutation tests and examines the significance of the effects of the input features on Y. We show the performance of the proposed method, compared with the false discovery rate method, using a simulation study in which a three-layer network structure exists. To demonstrate this method, we apply it to both real-life NIR spectral data and mass spectrometry data.
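The permutation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' OPLSR procedure: it assumes a plain PLS1 regression vector fitted by NIPALS (no orthogonal signal correction step), synthetic data, and a simple two-sided empirical p-value per feature. The function names `pls1_regression_vector` and `permutation_pvalues`, the number of components, and the 0.05 threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def pls1_regression_vector(X, y, n_components=2):
    """Regression vector of a PLS1 model (NIPALS) on mean-centered data.

    Stand-in for the OPLSR vector of the paper; no orthogonal
    signal correction is applied here (assumption of this sketch).
    """
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))  # weight vectors
    P = np.zeros((n_features, n_components))  # X loadings
    q = np.zeros(n_components)                # y loadings
    for a in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w                            # scores
        tt = t @ t
        p = Xk.T @ t / tt
        q[a] = yk @ t / tt
        Xk = Xk - np.outer(t, p)              # deflate X
        yk = yk - q[a] * t                    # deflate y
        W[:, a], P[:, a] = w, p
    # Standard PLS1 identity: b = W (P^T W)^{-1} q
    return W @ np.linalg.solve(P.T @ W, q)


def permutation_pvalues(X, y, n_permutations=200, n_components=2, seed=0):
    """Empirical two-sided p-value for each feature's coefficient."""
    rng = np.random.default_rng(seed)
    observed = pls1_regression_vector(X, y, n_components)
    null = np.empty((n_permutations, X.shape[1]))
    for i in range(n_permutations):
        # Permuting y breaks any X-y association, giving a null draw.
        null[i] = pls1_regression_vector(X, rng.permutation(y), n_components)
    exceed = (np.abs(null) >= np.abs(observed)).sum(axis=0)
    return observed, (exceed + 1) / (n_permutations + 1)


# Synthetic example: only features 0 and 1 influence y.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 20))
y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + 0.1 * rng.normal(size=80)

coef, pvals = permutation_pvalues(X, y)
selected = np.flatnonzero(pvals < 0.05)
print("selected feature indices:", selected)
```

Each permutation refits the model on a shuffled response, so the collected coefficients approximate what a feature's effect looks like when there is no real association with Y. A multiple-testing adjustment, such as the false discovery rate method the abstract uses as a comparison, is deliberately left out of this sketch.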


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78f1/7821640/2d5f6921bec5/13040_2021_240_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78f1/7821640/b98f34688699/13040_2021_240_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78f1/7821640/dbb264b28145/13040_2021_240_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78f1/7821640/525579dab0c7/13040_2021_240_Fig4_HTML.jpg
