Comparison of False Discovery Rate Control Strategies for Variant Peptide Identifications in Shotgun Proteogenomics.

Authors

Ivanov Mark V, Lobas Anna A, Karpov Dmitry S, Moshkovskii Sergei A, Gorshkov Mikhail V

Affiliations

Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia.

Moscow Institute of Physics and Technology (State University), Moscow Region, Dolgoprudny 141700, Russia.

Publication information

J Proteome Res. 2017 May 5;16(5):1936-1943. doi: 10.1021/acs.jproteome.6b01014. Epub 2017 Mar 31.

Abstract

Proteogenomic studies that aim to identify variant peptides through customized database searches of mass spectrometry data face a dilemma in selecting the most efficient search strategy: a choice has to be made between combined or sequential searches against reference (wild-type) and mutant protein databases, or a search directly against the mutant database without the wild-type one. Here we call these approaches "all-together", "one-by-one", and "direct", respectively. We share the results of a comparison of these search strategies obtained for large, publicly available proteogenomic data sets. This evaluation showed that the "all-together" strategy generally provided more variant peptide identifications than the "one-by-one" approach, while performing similarly in some specific cases. To further validate these results, we performed a control comparison of the strategies using publicly available data for a mixture of the annotated human protein standard UPS1 and E. coli. For these data, the "all-together" and "one-by-one" approaches showed similar sensitivity and specificity, while the "direct" approach resulted in an increased number of false identifications.
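
The three strategies differ only in which database each spectrum is allowed to match. The following is a minimal toy sketch of that difference, not the authors' pipeline: spectra are stood in for by peptide-like strings, `score` is a made-up positional similarity, the 0.8 cutoff is arbitrary, and all function names and sequences are hypothetical. Real search engines score fragment spectra within a precursor mass tolerance and control the false discovery rate with a separate procedure, none of which is modeled here.

```python
from typing import Dict, List


def score(spectrum: str, peptide: str) -> float:
    """Toy similarity (assumption): fraction of identical positions, 0 if lengths differ."""
    if len(spectrum) != len(peptide):
        return 0.0
    return sum(a == b for a, b in zip(spectrum, peptide)) / len(peptide)


def search(spectra: List[str], database: List[str], min_score: float = 0.8) -> Dict[str, str]:
    """Assign each spectrum to its best-scoring database peptide if it passes the cutoff."""
    hits: Dict[str, str] = {}
    for s in spectra:
        best = max(database, key=lambda p: score(s, p))
        if score(s, best) >= min_score:
            hits[s] = best
    return hits


def all_together(spectra: List[str], reference: List[str], mutant: List[str]) -> Dict[str, str]:
    """One search against reference + mutant; keep only hits to mutant sequences."""
    hits = search(spectra, reference + mutant)
    return {s: p for s, p in hits.items() if p in mutant}


def one_by_one(spectra: List[str], reference: List[str], mutant: List[str]) -> Dict[str, str]:
    """Search the reference first, then search only the unassigned spectra against mutant."""
    assigned = search(spectra, reference)
    remaining = [s for s in spectra if s not in assigned]
    return search(remaining, mutant)


def direct(spectra: List[str], reference: List[str], mutant: List[str]) -> Dict[str, str]:
    """Search the mutant database alone; the reference is ignored on purpose."""
    return search(spectra, mutant)


if __name__ == "__main__":
    # Hypothetical sequences: each mutant entry differs from a reference entry by one residue.
    reference = ["PEPTIDEK", "SAMPLERK"]
    mutant = ["PEPTIDER", "SAMPLEQK"]
    spectra = ["PEPTIDEK", "SAMPLEQK"]  # one wild-type spectrum, one true variant spectrum

    for strategy in (all_together, one_by_one, direct):
        print(strategy.__name__, strategy(spectra, reference, mutant))
```

In this toy, the wild-type spectrum has no correct candidate under `direct`, so it is assigned to a near-identical variant sequence, which is the kind of forced match the abstract associates with extra false identifications. The exact counts depend entirely on the made-up score and cutoff and carry no evidential weight.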
