EVEREST: a collection of evolutionary conserved protein domains.

Authors

Portugaly Elon, Linial Nathan, Linial Michal

Affiliations

School of Computer Science & Engineering, Institute of Life Sciences, The Hebrew University of Jerusalem.

Publication information

Nucleic Acids Res. 2007 Jan;35(Database issue):D241-6. doi: 10.1093/nar/gkl850. Epub 2006 Nov 11.

Abstract

Protein domains are subunits of proteins that recur throughout the protein world. There are many definitions attempting to capture the essence of a protein domain, and several systems that identify protein domains and classify them into families. EVEREST, recently described in Portugaly et al. (2006) BMC Bioinformatics, 7, 277, is one such system that performs the task automatically, using protein sequence alone. Herein we describe EVEREST release 2.0, consisting of 20,029 families, each defined by one or more HMMs. The current EVEREST database was constructed by scanning UniProt 8.1 and all PDB sequences (over 3,000,000 sequences in total) with each of the EVEREST families. EVEREST annotates 64% of all sequences and covers 59% of all residues. EVEREST is available at http://www.everest.cs.huji.ac.il/. The website provides annotations given by SCOP, CATH, Pfam A and EVEREST. It allows browsing through the families of each of those sources and graphically visualizes the domain organization of the proteins in each family. The website also provides access to analyses of relationships between domain families, within and across domain definition systems. Users can upload sequences for analysis by the set of EVEREST families. Finally, an advanced search form allows querying for families that match criteria regarding novelty, phylogenetic composition and more.
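The abstract reports two distinct coverage figures: the fraction of sequences carrying at least one annotation, and the fraction of residues falling inside an annotated domain. The following is a minimal sketch of how such figures could be computed from domain assignments. It is not the EVEREST pipeline; the function names, the 1-based inclusive interval convention, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical convention: a domain is a 1-based, inclusive (start, end) interval.
Interval = Tuple[int, int]


def covered_residues(domains: List[Interval]) -> int:
    """Count residues covered by at least one domain, counting overlaps once."""
    total = 0
    current_end = 0  # last residue position already counted
    for start, end in sorted(domains):
        start = max(start, current_end + 1)  # skip the part already counted
        if end >= start:
            total += end - start + 1
            current_end = end
    return total


def coverage(
    lengths: Dict[str, int],
    annotations: Dict[str, List[Interval]],
) -> Tuple[float, float]:
    """Return (sequence coverage, residue coverage) over all sequences in `lengths`."""
    annotated = sum(1 for seq_id in lengths if annotations.get(seq_id))
    residues = sum(
        covered_residues(annotations.get(seq_id, [])) for seq_id in lengths
    )
    return annotated / len(lengths), residues / sum(lengths.values())


# Toy data (invented): three sequences, two of which have domain assignments.
lengths = {"P1": 100, "P2": 50, "P3": 50}
annotations = {"P1": [(1, 40), (30, 60)], "P2": [(11, 30)]}

seq_cov, res_cov = coverage(lengths, annotations)
print(f"sequence coverage: {seq_cov:.2f}, residue coverage: {res_cov:.2f}")
```

In the toy data, two of three sequences are annotated, but only 80 of 200 residues lie inside a domain, which illustrates why the two percentages in the abstract can differ.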


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2e2/1781175/8d26dfa37dbf/gkl850f1.jpg
