Improved Discrimination of Disease States Using Proteomics Data with the Updated Aristotle Classifier.

Affiliation

Department of Chemistry, University of Kansas, Lawrence, Kansas 66045, United States.

Publication information

J Proteome Res. 2021 May 7;20(5):2823-2829. doi: 10.1021/acs.jproteome.1c00066. Epub 2021 Apr 28.

DOI:10.1021/acs.jproteome.1c00066
PMID:33909976
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8541691/
Abstract

Mass spectrometry data sets from omics studies are an optimal information source for discriminating patients with disease and identifying biomarkers. Thousands of proteins or endogenous metabolites can be queried in each analysis, spanning several orders of magnitude in abundance. Machine learning tools that effectively leverage these data to accurately identify disease states are in high demand. While mass spectrometry data sets are rich with potentially useful information, using the data effectively can be challenging because of missing entries in the data sets and because the number of samples is typically much smaller than the number of features, two challenges that make machine learning difficult. To address this problem, we have modified a new supervised classification tool, the Aristotle Classifier, so that omics data sets can be better leveraged for identifying disease states. The optimized classifier, AC.2021, is benchmarked on multiple data sets against its predecessor and two leading supervised classification tools, Support Vector Machine (SVM) and XGBoost. The new classifier, AC.2021, outperformed existing tools on multiple tests using proteomics data. The underlying code for the classifier, provided herein, would be useful for researchers who desire improved classification accuracy when using their omics data sets to identify disease states.
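The two difficulties named in the abstract, missing entries and far fewer samples than features, can be illustrated with a small benchmarking sketch. Everything below is an assumption made for illustration: the data are synthetic, the classifier is a plain nearest-centroid rule standing in for the tools being compared (it is not the Aristotle Classifier, SVM, or XGBoost), and median imputation is one common way to handle missing values, not necessarily the one used in the paper. The function names are hypothetical.

```python
import random
import statistics


def make_dataset(n_per_class=10, n_features=200, n_informative=20,
                 shift=4.0, missing_rate=0.2, seed=0):
    """Synthetic stand-in for an omics table: few samples, many features,
    and randomly missing entries (stored as None)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = []
            for j in range(n_features):
                value = rng.gauss(0.0, 1.0)
                # Only the first n_informative features differ between classes.
                if label == 1 and j < n_informative:
                    value += shift
                row.append(None if rng.random() < missing_rate else value)
            X.append(row)
            y.append(label)
    return X, y


def feature_medians(X_train):
    """Median of the observed values per feature (0.0 if none are observed)."""
    medians = []
    for j in range(len(X_train[0])):
        observed = [row[j] for row in X_train if row[j] is not None]
        medians.append(statistics.median(observed) if observed else 0.0)
    return medians


def impute(row, medians):
    """Replace each missing entry with the training median of its feature."""
    return [m if v is None else v for v, m in zip(row, medians)]


def nearest_centroid_predict(X_train, y_train, x):
    """Assign x to the class whose mean profile is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        members = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y):
    """Leave-one-out cross-validation; medians are fit on the training fold
    only, so the held-out sample never influences its own imputation."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        medians = feature_medians(X_train)
        X_filled = [impute(r, medians) for r in X_train]
        x_test = impute(X[i], medians)
        if nearest_centroid_predict(X_filled, y_train, x_test) == y[i]:
            correct += 1
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_dataset()
    print(f"Leave-one-out accuracy: {leave_one_out_accuracy(X, y):.2f}")
```

To compare several classifiers in the same spirit as the benchmark described above, one would swap `nearest_centroid_predict` for each candidate while keeping the data split and imputation step fixed, so that differences in accuracy reflect the classifier and not the preprocessing.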

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c7b/8541691/7aadb20e68e0/nihms-1748526-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c7b/8541691/b34b1539aefb/nihms-1748526-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c7b/8541691/d49b1e55f3a2/nihms-1748526-f0003.jpg
