Manual classification strategies in the ECOD database.

Author information

Cheng Hua, Liao Yuxing, Schaeffer R Dustin, Grishin Nick V

Affiliations

Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390.

Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, 75390.

Publication information

Proteins. 2015 Jul;83(7):1238-51. doi: 10.1002/prot.24818. Epub 2015 May 8.

Abstract

ECOD (Evolutionary Classification Of protein Domains) is a comprehensive and up-to-date protein structure classification database. The majority of new structures released from the PDB (Protein Data Bank) each week already have close homologs in the ECOD hierarchy and thus can be reliably partitioned into domains and classified by software without manual intervention. However, those proteins that lack confidently detectable homologs require careful analysis by experts. Although many bioinformatics resources rely on expert curation to some degree, specific examples of how this curation occurs and in what cases it is necessary are not always described. Here, we illustrate the manual classification strategy in ECOD by example, focusing on two major issues in protein classification: domain partitioning and the relationship between homology and similarity scores. Most examples show recently released and manually classified PDB structures. We discuss multi-domain proteins, discordance between sequence and structural similarities, difficulties with assessing homology with scores, and integral membrane proteins homologous to soluble proteins. By timely assimilation of newly available structures into its hierarchy, ECOD strives to provide a most accurate and updated view of the protein structure world as a result of combined computational and expert-driven analysis.
