Using deep learning to evaluate peaks in chromatographic data.

Author information

Risum Anne Bech, Bro Rasmus

Affiliation

Department of Food Science, University of Copenhagen, Denmark.

Publication information

Talanta. 2019 Nov 1;204:255-260. doi: 10.1016/j.talanta.2019.05.053. Epub 2019 May 22.

DOI:10.1016/j.talanta.2019.05.053

PMID:31357290

Abstract

Analysis of untargeted gas-chromatographic data is time consuming. With the earlier introduction of the PARAFAC2 (PARAllel FACtor analysis 2) based PARADISe (PARAFAC2 based Deconvolution and Identification System) approach in 2017, this task was made considerably more time-efficient. However, there are still a number of manual steps in the analysis which require data analytical expertise. One of these is the need to define whether or not each PARAFAC2 resolved component represents a peak suitable for integration. As the peaks may change in both shape and location on the elution time-axis, this presents a problem which cannot be readily solved by applying a linear classifier, such as PLS-DA (Partial Least Squares regression for Discriminant Analysis). As part of our ongoing efforts to further automate analysis of Gas Chromatography with Mass Spectrometry (GC-MS), we therefore explore a convolutional neural network classifier, capable of handling these shifts and variations in shape. The theory of convolutional neural networks and application on vector samples is briefly explained, and the performance is tested against a PLS-DA classifier, a shallow artificial neural network and a locally weighted regression model. The models are built on a training set with PARAFAC2 resolved components from eight different aroma related GC-MS runs with a total of over 70,000 elution profile samples, and validated using another, independent, GC-MS dataset. Based on Receiver Operating Characteristic curves (ROC) and manual analysis of the misclassified cases, it is shown that the convolutional network consistently outperforms the competing models, yielding an Area Under the Curve (AUC) value of 0.95 for peak classification. Examples are given illustrating that this new approach provides convincing means to automatically assess and evaluate modelled elution profiles of chromatographic data and thereby remove this laborious manual step.
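The abstract's argument that peak shifts defeat a linear classifier but not a convolutional one can be illustrated with a minimal NumPy sketch. This is not the authors' network: the Gaussian "elution profiles", the hand-set peak-shaped filter, and all function names below are assumptions made for illustration. A single filter is slid along the time axis and followed by a global maximum, which is one ingredient of a convolutional classifier, and is compared with a fixed-position linear score.

```python
import numpy as np

def gaussian_peak(n, center, width=3.0):
    """Synthetic elution profile: one Gaussian peak on an n-point time axis."""
    t = np.arange(n)
    return np.exp(-0.5 * ((t - center) / width) ** 2)

def conv_maxpool_score(profile, kernel):
    """Slide the kernel along the profile (valid cross-correlation), then
    keep the maximum response. Taking the maximum makes the score
    independent of where along the time axis the peak elutes."""
    responses = np.correlate(profile, kernel, mode="valid")
    return responses.max()

def linear_score(profile, weights):
    """Fixed-position linear score: one weight per time point."""
    return float(profile @ weights)

n = 100
kernel = gaussian_peak(15, center=7)    # peak-shaped filter (hand-set here, not learned)
weights = gaussian_peak(n, center=30)   # linear template tuned to a peak at t = 30

early = gaussian_peak(n, center=30)
late = gaussian_peak(n, center=60)      # same peak shape, shifted in elution time

conv_early = conv_maxpool_score(early, kernel)
conv_late = conv_maxpool_score(late, kernel)
lin_early = linear_score(early, weights)
lin_late = linear_score(late, weights)

print("conv + max-pool:", conv_early, conv_late)
print("linear template:", lin_early, lin_late)
```

Under these assumptions the convolution-plus-pooling score is the same for both profiles, while the linear score collapses once the peak moves away from the position its weights were tuned to. A trained network would learn many such filters and stack further layers on top of them.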


Similar articles

Plant metabolomics: resolution and quantification of elusive peaks in liquid chromatography-mass spectrometry profiles of complex plant extracts using multi-way decomposition methods.

J Chromatogr A. 2012 Nov 30;1266:84-94. doi: 10.1016/j.chroma.2012.10.023. Epub 2012 Oct 16.

Untargeted Metabolomic Profile for the Detection of Prostate Carcinoma-Preliminary Results from PARAFAC2 and PLS-DA Models.

Molecules. 2019 Aug 22;24(17):3063. doi: 10.3390/molecules24173063.

Automated pipeline for classifying Aroclors in soil by gas chromatography/mass spectrometry using modulo compressed two-way data objects.

Talanta. 2013 Dec 15;117:483-91. doi: 10.1016/j.talanta.2013.09.050. Epub 2013 Oct 7.

Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification

Fully automatic resolution of untargeted GC-MS data with deep learning assistance.

Talanta. 2022 Jul 1;244:123415. doi: 10.1016/j.talanta.2022.123415. Epub 2022 Mar 26.

Gas chromatography - mass spectrometry data processing made easy.

J Chromatogr A. 2017 Jun 23;1503:57-64. doi: 10.1016/j.chroma.2017.04.052. Epub 2017 Apr 27.

Comprehensive analysis of chromatographic data by using PARAFAC2 and principal components analysis.

J Chromatogr A. 2010 Jun 25;1217(26):4422-9. doi: 10.1016/j.chroma.2010.04.042. Epub 2010 Apr 22.

Deep learning framework for peak detection at the intact level of therapeutic proteins.

J Sep Sci. 2024 Jun;47(11):e2400051. doi: 10.1002/jssc.202400051.

Classification of weathered petroleum oils by multi-way analysis of gas chromatography-mass spectrometry data using PARAFAC2 parallel factor analysis.

J Chromatogr A. 2007 Sep 28;1166(1-2):163-70. doi: 10.1016/j.chroma.2007.07.085. Epub 2007 Aug 9.

Cited by

Artificial Intelligence in Natural Product Drug Discovery: Current Applications and Future Perspectives.

J Med Chem. 2025 Feb 27;68(4):3948-3969. doi: 10.1021/acs.jmedchem.4c01257. Epub 2025 Feb 6.

From multi-omics to predictive biomarker: AI in tumor microenvironment.

Front Immunol. 2024 Dec 23;15:1514977. doi: 10.3389/fimmu.2024.1514977. eCollection 2024.

Exploring the Effect of Different Storage Conditions on the Aroma Profile of Bread by Using Arrow-SPME GC-MS and Chemometrics.

Molecules. 2023 Apr 20;28(8):3587. doi: 10.3390/molecules28083587.

Development of Non-Targeted Mass Spectrometry Method for Distinguishing Spelt and Wheat.

Foods. 2022 Dec 27;12(1):141. doi: 10.3390/foods12010141.

An actionable annotation scoring framework for gas chromatography-high-resolution mass spectrometry.

Exposome. 2022 Aug 25;2(1):osac007. doi: 10.1093/exposome/osac007. eCollection 2022.

The Chemistry of Green and Roasted Coffee by Selectable 1D/2D Gas Chromatography Mass Spectrometry with Spectral Deconvolution.

Molecules. 2022 Aug 21;27(16):5328. doi: 10.3390/molecules27165328.

Unveiling Chemical Cues of Insect-Tree and Insect-Insect Interactions for the Eucalyptus Weevil and Its Egg Parasitoid by Multidimensional Gas Chromatographic Methods.

Molecules. 2022 Jun 23;27(13):4042. doi: 10.3390/molecules27134042.

Natural product drug discovery in the artificial intelligence era.

Chem Sci. 2021 Dec 13;13(6):1526-1546. doi: 10.1039/d1sc04471k. eCollection 2022 Feb 9.

Prediction of the performance of pre-packed purification columns through machine learning.

J Sep Sci. 2022 Apr;45(8):1445-1457. doi: 10.1002/jssc.202100864. Epub 2022 Mar 20.

Managing of Unassigned Mass Spectrometric Data by Neural Network for Cancer Phenotypes Classification.

J Pers Med. 2021 Dec 3;11(12):1288. doi: 10.3390/jpm11121288.
