PEPerMINT: peptide abundance imputation in mass spectrometry-based proteomics using graph neural networks.

Affiliations

Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, 14482, Germany.

Department of Computer Science and Engineering, Indian Institute of Technology, Ropar, Rupnagar, 140001, India.

Publication information

Bioinformatics. 2024 Sep 1;40(Suppl 2):ii70-ii78. doi: 10.1093/bioinformatics/btae389.

DOI:10.1093/bioinformatics/btae389
PMID:39230699
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11373339/
Abstract

MOTIVATION

Accurate quantitative information about protein abundance is crucial for understanding a biological system and its dynamics. Protein abundance is commonly estimated using label-free, bottom-up mass spectrometry (MS) protocols. Here, proteins are digested into peptides before quantification via MS. However, missing peptide abundance values, which can make up more than 50% of all abundance values, are a common issue. They result in missing protein abundance values, which then hinder accurate and reliable downstream analyses.
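
The way missing peptide values propagate to the protein level can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the peptide and protein names, the numbers, and the mean roll-up rule (real pipelines use a variety of summarization methods, and the abstract does not specify one).

```python
# Toy peptide-by-sample abundance table; None marks a missing value.
# All names and numbers are made up for illustration.
peptide_abundance = {
    "PEPTIDE_A": [20.1, None, 19.8],
    "PEPTIDE_B": [None, None, 18.9],
    "PEPTIDE_C": [22.4, 22.0, None],
}
# Hypothetical peptide-to-protein mapping.
protein_of = {"PEPTIDE_A": "PROT_1", "PEPTIDE_B": "PROT_1", "PEPTIDE_C": "PROT_2"}


def missing_fraction(table):
    """Fraction of all abundance values that are missing."""
    values = [v for row in table.values() for v in row]
    return sum(v is None for v in values) / len(values)


def protein_rollup(table, mapping):
    """Mean of observed peptide values per protein and sample.

    Assumed roll-up rule: a protein value is None when none of its
    peptides was observed in that sample.
    """
    n_samples = len(next(iter(table.values())))
    rows_by_protein = {}
    for peptide, row in table.items():
        rows_by_protein.setdefault(mapping[peptide], []).append(row)
    result = {}
    for protein, rows in rows_by_protein.items():
        result[protein] = []
        for j in range(n_samples):
            observed = [r[j] for r in rows if r[j] is not None]
            result[protein].append(sum(observed) / len(observed) if observed else None)
    return result


missing = missing_fraction(peptide_abundance)
protein_abundance = protein_rollup(peptide_abundance, protein_of)
```

In this toy table, PROT_1 has no value in the second sample because both of its peptides are missing there, which is the kind of gap that peptide-level imputation aims to close.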

RESULTS

To impute missing abundance values, we propose PEPerMINT, a graph neural network model working directly on the peptide level that flexibly takes both peptide-to-protein relationships in a graph format as well as amino acid sequence information into account. We benchmark our method against 11 common imputation methods on 6 diverse datasets, including cell lines, tissue, and plasma samples. We observe that PEPerMINT consistently outperforms other imputation methods. Its prediction performance remains high for varying degrees of missingness, different evaluation approaches, and differential expression prediction. As an additional novel feature, PEPerMINT provides meaningful uncertainty estimates and allows for tailoring imputation to the user's needs based on the reliability of imputed values.
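
As a rough illustration of using peptide-to-protein relationships and uncertainty filtering for imputation, the sketch below fills a missing peptide value from peptides that share a protein with it. This is not PEPerMINT's model: a graph neural network learns how to aggregate neighbor information and also uses sequence features, whereas this sketch swaps in a plain neighbor mean, with the neighbor spread as a crude stand-in for an uncertainty estimate. All names, values, and thresholds are hypothetical.

```python
from statistics import mean, pstdev

# Toy abundances for two samples; None marks a missing value (made-up data).
abundance = {
    "PEP_A": [20.0, None],
    "PEP_B": [21.0, 19.0],
    "PEP_C": [22.0, 19.5],
    "PEP_D": [None, 25.0],
    "PEP_E": [27.0, 24.0],
}
# Hypothetical bipartite peptide-to-protein graph; PEP_C is shared.
peptide_to_proteins = {
    "PEP_A": {"PROT_1"},
    "PEP_B": {"PROT_1"},
    "PEP_C": {"PROT_1", "PROT_2"},
    "PEP_D": {"PROT_2"},
    "PEP_E": {"PROT_2"},
}


def neighbours(peptide):
    """Peptides that share at least one protein with `peptide`."""
    own = peptide_to_proteins[peptide]
    return [p for p, prots in peptide_to_proteins.items() if p != peptide and prots & own]


def impute(peptide, sample):
    """Return (estimate, spread) from observed neighbours, or (None, None)."""
    observed = [
        abundance[n][sample]
        for n in neighbours(peptide)
        if abundance[n][sample] is not None
    ]
    if len(observed) < 2:  # a spread needs at least two observed neighbours
        return None, None
    return mean(observed), pstdev(observed)


def impute_all(max_uncertainty):
    """Keep only imputed values whose spread is within `max_uncertainty`."""
    filled = {}
    for peptide, row in abundance.items():
        for j, value in enumerate(row):
            if value is None:
                estimate, spread = impute(peptide, j)
                if estimate is not None and spread <= max_uncertainty:
                    filled[(peptide, j)] = estimate
    return filled


strict = impute_all(max_uncertainty=1.0)
lenient = impute_all(max_uncertainty=3.0)
```

With the strict threshold only the PEP_A value is filled, because its neighbors agree closely. The lenient threshold also accepts the PEP_D estimate, whose neighbors disagree more. This mirrors, in a very simplified form, the idea of tailoring imputation to the reliability of imputed values.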

AVAILABILITY AND IMPLEMENTATION

The code is available at https://github.com/DILiS-lab/pepermint.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4b6/11373339/e16a33eee611/btae389f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4b6/11373339/1d7b3ada599b/btae389f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4b6/11373339/34c80cc2f6db/btae389f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4b6/11373339/bf66fe8c9910/btae389f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4b6/11373339/731192c87e4d/btae389f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4b6/11373339/eca0cf401904/btae389f6.jpg
