Protannotator: a semiautomated pipeline for chromosome-wise functional annotation of the "missing" human proteome.

Authors

Islam Mohammad T, Garg Gagan, Hancock William S, Risk Brian A, Baker Mark S, Ranganathan Shoba

Affiliations

Department of Chemistry and Biomolecular Sciences and ARC Centre of Excellence in Bioinformatics, Macquarie University, Sydney, NSW 2109, Australia.

Publication information

J Proteome Res. 2014 Jan 3;13(1):76-83. doi: 10.1021/pr400794x. Epub 2013 Dec 13.

Abstract

The chromosome-centric human proteome project (C-HPP) aims to define the complete set of proteins encoded in each human chromosome. The neXtProt database (September 2013) lists 20,128 proteins for the human proteome, of which 3831 human proteins (∼19%) are considered "missing" according to the standard metrics table (released September 27, 2013). In support of the C-HPP initiative, we have extended the annotation strategy developed for human chromosome 7 "missing" proteins into a semiautomated pipeline to functionally annotate the "missing" human proteome. This pipeline integrates a suite of bioinformatics analysis and annotation software tools to identify homologues and map putative functional signatures, gene ontology, and biochemical pathways. From sequential BLAST searches, we have primarily identified homologues from reviewed nonhuman mammalian proteins with protein evidence for 1271 (33.2%) "missing" proteins, followed by 703 (18.4%) homologues from reviewed nonhuman mammalian proteins and subsequently 564 (14.7%) homologues from reviewed human proteins. Functional annotations for 1945 (50.8%) "missing" proteins were also determined. To accelerate the identification of "missing" proteins from proteomics studies, we generated proteotypic peptides in silico. Matching these proteotypic peptides to ENCODE proteogenomic data resulted in proteomic evidence for 107 (2.8%) of the 3831 "missing" proteins, while evidence from a recent membrane proteomic study supported the existence of another 15 "missing" proteins. The chromosome-wise functional annotation of all "missing" proteins is freely available to the scientific community through our web server (http://biolinfo.org/protannotator).
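
The abstract does not say how the proteotypic peptides were generated in silico, so the following is only an illustrative sketch and not the authors' method. It assumes a trypsin-like digestion rule (cleave after K or R unless the next residue is P), no missed cleavages, a peptide length window of 7 to 30 residues, and a working definition of "proteotypic" as a peptide that occurs in exactly one protein of the input set. The function names, the length window, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    Assumed cleavage rule for illustration; no missed cleavages are modelled.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal fragment that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def proteotypic_candidates(proteins, min_len=7, max_len=30):
    """Return {protein_id: [peptides found in that protein only]}.

    `proteins` maps an identifier to its amino acid sequence. The length
    window is an assumed filter, not a value taken from the paper.
    """
    owners = defaultdict(set)
    for protein_id, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            if min_len <= len(peptide) <= max_len:
                owners[peptide].add(protein_id)

    unique = defaultdict(list)
    for peptide, ids in owners.items():
        if len(ids) == 1:
            unique[next(iter(ids))].append(peptide)
    return dict(unique)


# Toy input: two made-up sequences that share a C-terminal peptide.
toy_proteins = {
    "P1": "MAAAAAAKSHAREDPEPTIDEK",
    "P2": "MCCCCCCKSHAREDPEPTIDEK",
}
candidates = proteotypic_candidates(toy_proteins)
```

In this toy case the shared peptide is discarded because it maps to both sequences, and the short fragment falls outside the length window, leaving one distinguishing peptide per protein. A real workflow would also need to check uniqueness against the whole reference proteome rather than only the "missing" set.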
