Proteogenomics from a bioinformatics angle: A growing field.

Author information

Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Lab of Bioinformatics and Computational Genomics, Ghent University, Ghent, Belgium.

Department of Biochemistry and Molecular Pharmacology, Center for Health Informatics and Bioinformatics, New York University School of Medicine, New York, NY.

Publication information

Mass Spectrom Rev. 2017 Sep;36(5):584-599. doi: 10.1002/mas.21483. Epub 2015 Dec 15.

Abstract

Proteogenomics is a research area that combines proteomics and genomics in a multi-omics setup, using both mass spectrometry and high-throughput sequencing technologies. Currently, the main goals of the field are to aid genome annotation or to unravel proteome complexity. Mass spectrometry-based identification of matching or homologous peptides can further refine gene models. Novel proteoforms can also be identified by analyzing high-throughput sequencing data from RNA-seq or ribosome profiling experiments to detect novel translation initiation sites (cognate or near-cognate), novel transcript isoforms, sequence variation, or novel (small) open reading frames in intergenic or untranslated genic regions. Other proteogenomics studies that combine proteomics and genomics techniques focus on antibody sequencing or on the identification of immunogenic peptides or venom peptides. Over the years, a growing number of bioinformatics tools and databases have become available to help streamline these cross-omics studies. Some of these solutions help only in specific steps of a proteogenomics study, e.g. building custom sequence databases (based on next-generation sequencing output) for mass spectrometry fragmentation spectrum matching. Over the last few years, a handful of integrative tools that can execute complete proteogenomics analyses have also become available. Some of these are presented as stand-alone solutions, whereas others are implemented in a web-based framework such as Galaxy. In this review, we aim to sketch a comprehensive overview of all the bioinformatics solutions that are available for this growing research area. © 2015 Wiley Periodicals, Inc. Mass Spec Rev 36:584-599, 2017.
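
To make the custom-database step mentioned in the abstract more concrete, the following is a minimal sketch, not taken from any tool covered by the review. It assumes transcript sequences are already assembled (for example from RNA-seq), translates the three forward reading frames with the standard genetic code, keeps ATG-initiated, stop-terminated open reading frames above a minimum length, and emits them as FASTA text for a search engine. The function names, the `min_length` parameter, and the FASTA header layout are hypothetical, and near-cognate start codons and the reverse strand are deliberately ignored to keep the example short.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate a nucleotide string codon by codon; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def orfs_in_forward_frames(transcript, min_length=10):
    """Yield (frame, protein) for ATG-initiated, stop-terminated ORFs.

    Assumption: only the three forward frames and cognate (ATG) starts
    are considered; min_length is counted in amino acids.
    """
    for frame in range(3):
        protein = translate(transcript[frame:])
        # Every segment except the last one is followed by a stop codon.
        for segment in protein.split("*")[:-1]:
            start = segment.find("M")
            if start != -1 and len(segment) - start >= min_length:
                yield frame, segment[start:]


def build_fasta(transcripts, min_length=10):
    """Return FASTA text with one entry per ORF (hypothetical header layout)."""
    lines = []
    for name, seq in transcripts.items():
        orfs = orfs_in_forward_frames(seq, min_length)
        for index, (frame, protein) in enumerate(orfs, start=1):
            lines.append(f">{name}|frame{frame + 1}|orf{index}")
            lines.append(protein)
    return "\n".join(lines)


# Toy transcript containing a single short ORF in the third forward frame.
example = {"tx1": "CCATGGCTAAATTTGGGTGACC"}
print(build_fasta(example, min_length=5))
```

In practice such a database would usually be merged with a reference proteome and filtered for redundancy before spectrum matching, because a larger search space affects false discovery rate estimation.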


