PGMiner reloaded, fully automated proteogenomic annotation tool linking genomes to proteomes.

Author information

Has Canan, Lashin Sergey A, Kochetov Alexey V, Allmer Jens

Publication information

J Integr Bioinform. 2016 Dec 18;13(4):293. doi: 10.2390/biecoll-jib-2016-293.

Abstract

Improvements in genome sequencing technology have increased the availability of full genomes and transcriptomes of many organisms. However, the major benefit of massively parallel sequencing is a better understanding of the organization and function of genes, which in turn leads to an understanding of phenotypes. Several tools are currently available for interpreting genomic data through automated gene annotation. Even though the accuracy of computational gene annotation is increasing, multiple lines of experimental evidence should still be gathered. Mass spectrometry allows the identification and sequencing of proteins as major gene products, and it is only these proteins that conclusively show whether a part of a genome is a coding region that results in a phenotype. Therefore, in the field of proteogenomics, computational methods are validated by exploiting mass spectrometric data. As a result, novel protein coding regions can be identified, current gene models can be validated, and the upstream and downstream regions of genes can be determined. In this paper, we present new functionality for our proteogenomic tool, PGMiner, which performs all proteogenomic steps, such as acquisition of mass spectrometric data, peptide identification against preprocessed sequence databases, assignment of statistical confidence to identified peptides, mapping of confident peptides to gene models, and result visualization. The extensions cover the determination of proteotypic peptides and thus unambiguous protein identification. Furthermore, peptides conflicting with gene models can now be automatically assessed within the context of predicted alternative open reading frames.
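
To make the notion of proteotypic peptides concrete, the following is a minimal Python sketch, not PGMiner's actual implementation. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages) and treats a peptide as proteotypic when it occurs in exactly one protein of the database. The function names, the `min_length` parameter, and the two example sequences are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def tryptic_digest(sequence, min_length=6):
    """In silico digest under a simplified trypsin rule (an assumption):
    cleave after K or R unless the next residue is P; no missed cleavages."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return [p for p in peptides if len(p) >= min_length]


def proteotypic_peptides(proteins, min_length=6):
    """Map each peptide that occurs in exactly one protein to that protein."""
    peptide_to_proteins = defaultdict(set)
    for protein_id, sequence in proteins.items():
        for peptide in tryptic_digest(sequence, min_length):
            peptide_to_proteins[peptide].add(protein_id)
    return {
        peptide: next(iter(ids))
        for peptide, ids in peptide_to_proteins.items()
        if len(ids) == 1
    }


# Hypothetical toy database: both proteins contain the peptide GLSPEVLK.
proteins = {
    "P1": "MSTAYIAKGLSPEVLK",
    "P2": "GLSPEVLKAAWDELTR",
}
print(proteotypic_peptides(proteins))
# → {'MSTAYIAK': 'P1', 'AAWDELTR': 'P2'}
```

In this toy case the shared peptide GLSPEVLK is excluded because it cannot distinguish P1 from P2, while each remaining peptide identifies its protein unambiguously. A real pipeline would additionally need to handle missed cleavages, modifications, and isobaric residues such as I and L.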

