FineFDR: Fine-grained Taxonomy-specific False Discovery Rates Control in Metaproteomics.

Authors

Wang Shengze, Feng Shichao, Pan Chongle, Guo Xuan

Affiliations

Department of Computer Science and Engineering, University of North Texas, Denton, TX 76207, United States.

School of Computer Science; Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK 73019, United States.

Publication information

Proceedings (IEEE Int Conf Bioinformatics Biomed). 2022 Dec;2022:287-292. doi: 10.1109/bibm55620.2022.9995401. Epub 2023 Jan 2.

Abstract

Microbial community proteomics, also termed metaproteomics, investigates all proteins expressed by a microbiota. Tandem mass spectrometry (MS/MS) is the typical method for identifying proteins in metaproteomics, which involves searching the mass spectra against a protein sequence database. A major post-analysis step is controlling the false discovery rate (FDR), i.e., the ratio of false positives to the total number of annotations. The current popular target-decoy FDR estimation method treats all the peptides and proteins equally and overlooks that they could have varied probabilities of being identified. In this study, we report FineFDR, a framework for FDR assessment at fine-grained levels with taxonomy information considered. FineFDR groups the identified peptide-spectrum matches, peptides, and proteins from different taxonomic units and estimates the FDR in each group separately. Empirical experiments on the simulated and real-world data sets demonstrate that our FineFDR achieved higher precision and more peptide and protein identifications when compared to the state-of-the-art methods, such as Comet, Percolator, TIDD, and Tailor. FineFDR is freely available under the GNU GPL license at https://github.com/Biocomputing-Research-Group/FDR.
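The group-wise idea in the abstract can be illustrated with a minimal Python sketch. It is not the FineFDR implementation: the abstract does not specify the estimator, so the sketch assumes the simple target-decoy estimate (decoys divided by targets above a score cutoff), assumes that higher scores are better, and uses hypothetical names (`PSM`, `accept_at_fdr`, `accept_per_taxon`) and made-up toy scores.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """A peptide-spectrum match (hypothetical minimal record)."""
    score: float    # assumption: higher score means a better match
    is_decoy: bool  # True if the match is to a decoy sequence
    taxon: str      # assumption: taxonomic label of the matched protein


def accept_at_fdr(psms, threshold):
    """Return target PSMs in the longest score-ranked prefix whose
    estimated FDR (decoys / targets) does not exceed the threshold."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= threshold:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]


def accept_per_taxon(psms, threshold):
    """Group PSMs by taxon and apply the FDR filter to each group separately."""
    groups = defaultdict(list)
    for psm in psms:
        groups[psm.taxon].append(psm)
    return {taxon: accept_at_fdr(group, threshold)
            for taxon, group in groups.items()}


# Toy data: taxon A has well-separated targets, taxon B does not.
psms = [
    PSM(10, False, "A"), PSM(9, False, "A"), PSM(8, False, "A"), PSM(1, True, "A"),
    PSM(5, True, "B"), PSM(4, False, "B"), PSM(3, False, "B"),
]

pooled = accept_at_fdr(psms, 0.25)
per_taxon = accept_per_taxon(psms, 0.25)
print(len(pooled), {taxon: len(kept) for taxon, kept in per_taxon.items()})
```

On this toy input, the pooled filter at a 0.25 threshold accepts all five targets, while the per-taxon filter keeps the three targets from taxon A and none from taxon B, where the best-scoring match is a decoy. The example only shows that grouping changes which identifications pass; it says nothing about the precision gains reported in the abstract.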


Similar articles

Deep learning for peptide identification from metaproteomics datasets.
J Proteomics. 2021 Sep 15;247:104316. doi: 10.1016/j.jprot.2021.104316. Epub 2021 Jul 8.

False discovery rates in spectral identification.
BMC Bioinformatics. 2012;13 Suppl 16(Suppl 16):S2. doi: 10.1186/1471-2105-13-S16-S2. Epub 2012 Nov 5.
