
Robust variable selection in the framework of classification with label noise and outliers: Applications to spectroscopic data in agri-food.

Affiliations

Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Milan, Italy.

Univ. Lille, CNRS, UMR 8516, LASIRE-Laboratoire avancé de spectroscopie pour les interactions, la réactivité et l'environnement, F-59000, Lille, France.

Publication information

Anal Chim Acta. 2021 Apr 8;1153:338245. doi: 10.1016/j.aca.2021.338245. Epub 2021 Feb 1.

DOI: 10.1016/j.aca.2021.338245
PMID: 33714445
Abstract

Classification of high-dimensional spectroscopic data is a common task in analytical chemistry. Well-established procedures like support vector machines (SVMs) and partial least squares discriminant analysis (PLS-DA) are the most common methods for tackling this supervised learning problem. Nonetheless, interpretation of these models remains sometimes difficult, and solutions based on feature selection are often adopted as they lead to the automatic identification of the most informative wavelengths. Unfortunately, for some delicate applications like food authenticity, mislabeled and adulterated spectra occur both in the calibration and/or validation sets, with dramatic effects on the model development, its prediction accuracy and robustness. Motivated by these issues, the present paper proposes a robust model-based method that simultaneously performs variable selection, outliers and label noise detection. We demonstrate the effectiveness of our proposal in dealing with three agri-food spectroscopic studies, where several forms of perturbations are considered. Our approach succeeds in diminishing problem complexity, identifying anomalous spectra and attaining competitive predictive accuracy considering a very low number of selected wavelengths.
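
The abstract does not spell out the algorithm, so the following is only a rough illustration of how flagging suspicious spectra and selecting wavelengths can work together. It is not the authors' method: the diagonal Gaussian class model, the trimming fraction `alpha`, the Fisher-score ranking, the synthetic data, and every function name below are assumptions made for this sketch. It uses only the Python standard library so that it stays self-contained.

```python
import random


def column_means(rows):
    """Mean of each column (wavelength) over a list of spectra."""
    return [sum(col) / len(col) for col in zip(*rows)]


def fit(X, y, keep):
    """Class means and pooled per-wavelength variance from retained spectra.
    Assumes every class keeps at least one retained spectrum."""
    means = {c: column_means([x for x, lab, k in zip(X, y, keep) if k and lab == c])
             for c in set(y)}
    p, n_kept = len(X[0]), sum(keep)
    sq = [0.0] * p
    for x, lab, k in zip(X, y, keep):
        if k:
            for j in range(p):
                sq[j] += (x[j] - means[lab][j]) ** 2
    return means, [s / n_kept + 1e-9 for s in sq]


def trimmed_fit(X, y, alpha=0.1, n_iter=20):
    """Alternate between fitting and discarding the alpha fraction of spectra
    that are least plausible under their *assigned* label (assumed scheme)."""
    n = len(X)
    n_trim = round(alpha * n)
    keep = [True] * n
    for _ in range(n_iter):
        means, var = fit(X, y, keep)
        # Gaussian log-density (up to a constant) under the labeled class.
        scores = [-0.5 * sum((a - m) ** 2 / v for a, m, v in zip(x, means[lab], var))
                  for x, lab in zip(X, y)]
        trimmed = set(sorted(range(n), key=scores.__getitem__)[:n_trim])
        new_keep = [i not in trimmed for i in range(n)]
        if new_keep == keep:
            break
        keep = new_keep
    means, var = fit(X, y, keep)  # final estimates match the returned flags
    return means, var, keep


def fisher_scores(X, y, keep):
    """Between-class over within-class sum of squares, per wavelength,
    computed on the retained spectra only."""
    Xk = [x for x, k in zip(X, keep) if k]
    yk = [lab for lab, k in zip(y, keep) if k]
    grand = column_means(Xk)
    p = len(grand)
    between, within = [0.0] * p, [0.0] * p
    for c in set(yk):
        rows = [x for x, lab in zip(Xk, yk) if lab == c]
        mu = column_means(rows)
        for j in range(p):
            between[j] += len(rows) * (mu[j] - grand[j]) ** 2
            within[j] += sum((x[j] - mu[j]) ** 2 for x in rows)
    return [b / (w + 1e-12) for b, w in zip(between, within)]


def make_demo(seed=0, n_per_class=100, p=12, informative=(3, 7), shift=8.0, n_flip=6):
    """Synthetic 'spectra': two classes that differ only at two wavelengths,
    with the first n_flip class-0 spectra deliberately mislabeled as class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for c in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(0.0, 1.0) for _ in range(p)]
            if c == 1:
                for j in informative:
                    x[j] += shift
            X.append(x)
            y.append(c)
    flipped = list(range(n_flip))
    for i in flipped:
        y[i] = 1
    return X, y, flipped


X, y, flipped = make_demo()
means, var, keep = trimmed_fit(X, y, alpha=0.1)
scores = fisher_scores(X, y, keep)
selected = sorted(sorted(range(len(scores)), key=scores.__getitem__)[-2:])
print("flagged spectra:", [i for i, k in enumerate(keep) if not k])
print("selected wavelengths:", selected)
```

In this toy setting the trimmed spectra serve as candidates for both outliers and label noise, and the wavelength ranking is computed only on what remains, so contaminated samples do not distort the selection. A real application would need a model suited to strongly correlated spectral variables, which this diagonal approximation ignores.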

Similar articles

1. Robust variable selection in the framework of classification with label noise and outliers: Applications to spectroscopic data in agri-food.
   Anal Chim Acta. 2021 Apr 8;1153:338245. doi: 10.1016/j.aca.2021.338245. Epub 2021 Feb 1.
2. Comparison of methods for the detection of outliers and associated biomarkers in mislabeled omics data.
   BMC Bioinformatics. 2020 Aug 14;21(1):357. doi: 10.1186/s12859-020-03653-9.
3. [Producing area identification of Letinus edodes using mid-infrared spectroscopy].
   Guang Pu Xue Yu Guang Pu Fen Xi. 2014 Mar;34(3):664-7.
4. Identification of Transgenic Soybean Varieties Using Mid-Infrared Spectroscopy.
   Guang Pu Xue Yu Guang Pu Fen Xi. 2017 Mar;37(3):760-5.
5. [Quantitative analysis method of natural gas combustion process combining wavelength selection and outlier spectra detection].
   Guang Pu Xue Yu Guang Pu Fen Xi. 2012 Oct;32(10):2799-804.
6. Authenticity identification and classification of Rhodiola species in traditional Tibetan medicine based on Fourier transform near-infrared spectroscopy and chemometrics analysis.
   Spectrochim Acta A Mol Biomol Spectrosc. 2018 Nov 5;204:131-140. doi: 10.1016/j.saa.2018.06.004. Epub 2018 Jun 2.
7. Simultaneous wavelength selection and outlier detection in multivariate regression of near-infrared spectra.
   Anal Sci. 2005 Feb;21(2):161-6. doi: 10.2116/analsci.21.161.
8. Classification of structurally related commercial contrast media by near infrared spectroscopy.
   J Pharm Biomed Anal. 2014 Mar;90:148-60. doi: 10.1016/j.jpba.2013.11.033. Epub 2013 Dec 7.
9. [Discrimination of Varieties of Cabbage with Near Infrared Spectra Based on Principal Component Analysis and Successive Projections Algorithm].
   Guang Pu Xue Yu Guang Pu Fen Xi. 2016 Nov;36(11):3536-41.
10. [Two-Dimensional Hetero-Spectral Near-Infrared and Mid-Infrared Correlation Spectroscopy for Discrimination Adulterated Milk].
   Guang Pu Xue Yu Guang Pu Fen Xi. 2015 Aug;35(8):2099-102.

Cited by

1. Raman spectroscopy-based prediction of ofloxacin concentration in solution using a novel loss function and an improved GA-CNN model.
   BMC Bioinformatics. 2023 Oct 30;24(1):409. doi: 10.1186/s12859-023-05542-3.