
Evaluation of Multivariate Classification Models for Analyzing NMR Metabolomics Data.

Affiliations

Department of Statistics , University of Nebraska-Lincoln , Lincoln , Nebraska 68583-0963 , United States.

Department of Chemistry , University of Nebraska-Lincoln , Lincoln , Nebraska 68588-0304 , United States.

Publication information

J Proteome Res. 2019 Sep 6;18(9):3282-3294. doi: 10.1021/acs.jproteome.9b00227. Epub 2019 Aug 22.

DOI:10.1021/acs.jproteome.9b00227
PMID:31382745
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6733656/
Abstract

Analytical techniques such as NMR and mass spectrometry can generate large metabolomics data sets containing thousands of spectral features derived from numerous biological observations. Multivariate data analysis is routinely used to uncover the underlying biological information contained within these large metabolomics data sets. This is typically accomplished by classifying the observations into groups (e.g., control versus treated) and by identifying associated discriminating features. There are a variety of classification models to select from, which include some well-established techniques (e.g., principal component analysis [PCA], orthogonal projection to latent structure [OPLS], or partial least-squares projection to latent structures [PLS]) and newly emerging machine learning algorithms (e.g., support vector machines or random forests). However, it is unclear which classification model, if any, is an optimal choice for the analysis of metabolomics data. Herein, we present a comprehensive evaluation of five common classification models that are routinely employed in the metabolomics field and are currently available in our MVAPACK metabolomics software package. Simulated and experimental NMR data sets with various levels of group separation were used to evaluate each model. Model performance was assessed by classification accuracy rate, by the area under the receiver operating characteristic curve (AUROC), and by the identification of true discriminating features. Our findings suggest that the five classification models perform equally well with robust data sets. Only when the models are stressed with subtle data set differences does OPLS emerge as the best-performing model. With limited group separation, OPLS maintained a high prediction accuracy rate and a large area under the ROC curve while yielding loadings closest to the true loadings.
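Two of the performance measures named in the abstract, classification accuracy rate and AUROC, can be illustrated with a minimal Python sketch. This is not the MVAPACK implementation and does not reproduce the paper's simulation design. The function names (`accuracy`, `auroc`, `simulate_scores`), the one-dimensional Gaussian scores, and the midpoint decision threshold are all assumptions made for illustration. AUROC is computed here through its rank (Mann-Whitney) formulation: the fraction of treated/control pairs in which the treated observation receives the higher score.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of observations assigned to the correct group."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auroc(y_true, scores):
    """AUROC via the rank (Mann-Whitney) formulation; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both groups must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def simulate_scores(separation, n_per_group=50, seed=0):
    """Hypothetical classifier scores for two groups (assumption, not paper data).

    Control scores are drawn from N(0, 1) and treated scores from
    N(separation, 1), so a smaller `separation` mimics subtler group differences.
    """
    rng = random.Random(seed)
    labels = [0] * n_per_group + [1] * n_per_group
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    scores += [rng.gauss(separation, 1.0) for _ in range(n_per_group)]
    return labels, scores


if __name__ == "__main__":
    for separation in (0.5, 1.0, 4.0):
        labels, scores = simulate_scores(separation)
        # Assign to the treated group when the score exceeds the midpoint.
        predicted = [1 if s > separation / 2 else 0 for s in scores]
        print(separation,
              round(accuracy(labels, predicted), 3),
              round(auroc(labels, scores), 3))
```

In this toy setting both measures approach 1 once the groups are well separated, so they only distinguish between classifiers when the separation is small, which parallels the abstract's observation that the models differ only on subtle data sets.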


