Interpolation and extrapolation problems of multivariate regression in analytical chemistry: benchmarking the robustness on near-infrared (NIR) spectroscopy data.

Affiliation

Department of Chemistry and Applied Biosciences, ETH Zurich, 8093 Zurich, Switzerland.

Publication information

Analyst. 2012 Apr 7;137(7):1604-10. doi: 10.1039/c2an15972d. Epub 2012 Feb 16.

DOI:10.1039/c2an15972d
PMID:22337290
Abstract

Modern analytical chemistry of industrial products is in need of rapid, robust, and cheap analytical methods to continuously monitor product quality parameters. For this reason, spectroscopic methods are often used to control the quality of industrial products in an on-line/in-line regime. Vibrational spectroscopy, including mid-infrared (MIR), Raman, and near-infrared (NIR), is one of the best ways to obtain information about the chemical structures and the quality coefficients of multicomponent mixtures. Together with chemometric algorithms and multivariate data analysis (MDA) methods, which were especially created for the analysis of complicated, noisy, and overlapping signals, NIR spectroscopy shows great results in terms of its accuracy, including classical prediction error, RMSEP. However, it is unclear whether the combined NIR + MDA methods are capable of dealing with much more complex interpolation or extrapolation problems that are inevitably present in real-world applications. In the current study, we try to make a rather general comparison of linear, such as partial least squares or projection to latent structures (PLS); "quasi-nonlinear", such as the polynomial version of PLS (Poly-PLS); and intrinsically non-linear, such as artificial neural networks (ANNs), support vector regression (SVR), and least-squares support vector machines (LS-SVM/LSSVM), regression methods in terms of their robustness. As a measure of robustness, we will try to estimate their accuracy when solving interpolation and extrapolation problems. Petroleum and biofuel (biodiesel) systems were chosen as representative examples of real-world samples. Six very different chemical systems that differed in complexity, composition, structure, and properties were studied; these systems were gasoline, ethanol-gasoline biofuel, diesel fuel, aromatic solutions of petroleum macromolecules, petroleum resins in benzene, and biodiesel. Eighteen different sample sets were used in total. 
General conclusions are drawn about the applicability of ANN- and SVM-based regression tools in modern analytical chemistry. The relative effectiveness of the multivariate algorithms changes when the criterion shifts from classical accuracy to robustness. Neural networks, which are capable of producing very accurate results with respect to classical RMSEP, are not able to solve interpolation problems or, especially, extrapolation problems. The chemometric methods that are based on the support vector machine (SVM) ideology are capable of solving both classical regression and interpolation/extrapolation tasks.
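The abstract contrasts classical RMSEP with accuracy on interpolation and extrapolation problems but does not spell out how such test sets are built. The sketch below is one plausible reading, not the authors' protocol: it assumes that an interpolation test holds out samples from the middle of the property range and an extrapolation test holds out samples from its upper end. The helper names (`rmsep`, `split_by_target`, `fit_line`, `evaluate`), the synthetic data, and the straight-line model standing in for PLS, ANN, or SVR are all hypothetical.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over a held-out set."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def split_by_target(y, lo, hi):
    """Return (train_idx, test_idx); test samples satisfy lo <= y <= hi."""
    test = [i for i, v in enumerate(y) if lo <= v <= hi]
    train = [i for i, v in enumerate(y) if not (lo <= v <= hi)]
    return train, test


def fit_line(x, y):
    """Ordinary least squares for y = a*x + b (a stand-in for a real model)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Synthetic data (assumption): property y and a mildly non-linear signal x.
y = [float(i) for i in range(20)]
x = [v - 0.02 * v ** 2 for v in y]


def evaluate(lo, hi):
    """Train outside [lo, hi], then report RMSEP inside [lo, hi]."""
    train, test = split_by_target(y, lo, hi)
    a, b = fit_line([x[i] for i in train], [y[i] for i in train])
    preds = [a * x[i] + b for i in test]
    return rmsep([y[i] for i in test], preds)


interp_error = evaluate(8.0, 11.0)   # held-out range lies inside the training range
extrap_error = evaluate(16.0, 19.0)  # held-out range lies beyond the training range
print(f"interpolation RMSEP: {interp_error:.3f}")
print(f"extrapolation RMSEP: {extrap_error:.3f}")
```

Replacing `fit_line` with any regression method lets the same two splits be reused, so methods can be compared on robustness rather than only on a random hold-out RMSEP.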


Similar articles

1
Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.
Analyst. 2011 Apr 21;136(8):1703-12. doi: 10.1039/c0an00387e. Epub 2011 Feb 25.
2
Variable selection in near-infrared spectroscopy: benchmarking of feature selection methods on biodiesel data.
Anal Chim Acta. 2011 Apr 29;692(1-2):63-72. doi: 10.1016/j.aca.2011.03.006. Epub 2011 Mar 8.
3
Simultaneous determination of hydrocarbon renewable diesel, biodiesel and petroleum diesel contents in diesel fuel blends using near infrared (NIR) spectroscopy and chemometrics.
Analyst. 2013 Nov 7;138(21):6477-87. doi: 10.1039/c3an00883e.
4
Biodiesel content determination in diesel fuel blends using near infrared (NIR) spectroscopy and support vector machines (SVM).
Talanta. 2013 Jan 30;104:155-61. doi: 10.1016/j.talanta.2012.11.033. Epub 2012 Nov 23.
5
Melamine detection by mid- and near-infrared (MIR/NIR) spectroscopy: a quick and sensitive method for dairy products analysis including liquid milk, infant formula, and milk powder.
Talanta. 2011 Jul 15;85(1):562-8. doi: 10.1016/j.talanta.2011.04.026. Epub 2011 Apr 19.
6
Support vector machine regression (LS-SVM)--an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data?
Phys Chem Chem Phys. 2011 Jun 28;13(24):11710-8. doi: 10.1039/c1cp00051a. Epub 2011 May 19.
7
Gasoline classification using near infrared (NIR) spectroscopy data: comparison of multivariate techniques.
Anal Chim Acta. 2010 Jun 25;671(1-2):27-35. doi: 10.1016/j.aca.2010.05.013. Epub 2010 Jun 1.
8
Biodiesel classification by base stock type (vegetable oil) using near infrared spectroscopy data.
Anal Chim Acta. 2011 Mar 18;689(2):190-7. doi: 10.1016/j.aca.2011.01.041. Epub 2011 Jan 26.
9
Non-linear calibration models for near infrared spectroscopy.
Anal Chim Acta. 2014 Feb 27;813:1-14. doi: 10.1016/j.aca.2013.12.002. Epub 2013 Dec 9.

Cited by

1
Single compound data supplementation to enhance transferability of fermentation specific Raman spectroscopy models.
Anal Bioanal Chem. 2025 Apr;417(9):1873-1884. doi: 10.1007/s00216-025-05768-5. Epub 2025 Feb 6.
2
Trait selection strategy in multi-trait GWAS: Boosting SNP discoverability.
HGG Adv. 2024 Jul 18;5(3):100319. doi: 10.1016/j.xhgg.2024.100319. Epub 2024 Jun 13.
3
Determination of physicochemical properties of petroleum derivatives and biodiesel using GC/MS and chemometric methods with uncertainty estimation.
Fuel (Lond). 2019 May 1;243:413-422. doi: 10.1016/j.fuel.2018.12.126.
4
Trait selection strategy in multi-trait GWAS: Boosting SNPs discoverability.
bioRxiv. 2023 Oct 27:2023.10.27.564319. doi: 10.1101/2023.10.27.564319.
5
Iterative machine learning-based chemical similarity search to identify novel chemical inhibitors.
J Cheminform. 2023 Sep 23;15(1):86. doi: 10.1186/s13321-023-00760-6.
6
Surface-enhanced Raman spectroscopy method for classification of doxycycline hydrochloride and tylosin in duck meat using gold nanoparticles.
Poult Sci. 2021 Jun;100(6):101165. doi: 10.1016/j.psj.2021.101165. Epub 2021 Mar 27.
7
The Sample, the Spectra and the Maths-The Critical Pillars in the Development of Robust and Sound Applications of Vibrational Spectroscopy.
Molecules. 2020 Aug 12;25(16):3674. doi: 10.3390/molecules25163674.
8
Flexible Data Trimming Improves Performance of Global Machine Learning Methods in Omics-Based Personalized Oncology.
Int J Mol Sci. 2020 Jan 22;21(3):713. doi: 10.3390/ijms21030713.
9
FLOating-Window Projective Separator (FloWPS): A Data Trimming Tool for Support Vector Machines (SVM) to Improve Robustness of the Classifier.
Front Genet. 2019 Jan 15;9:717. doi: 10.3389/fgene.2018.00717. eCollection 2018.