DDOMAIN: Dividing structures into domains using a normalized domain-domain interaction profile.

Authors

Zhou Hongyi, Xue Bin, Zhou Yaoqi

Affiliation

Howard Hughes Medical Institute Center for Single Molecule Biophysics, Department of Physiology and Biophysics, State University of New York at Buffalo, Buffalo, New York 14214, USA.

Publication information

Protein Sci. 2007 May;16(5):947-55. doi: 10.1110/ps.062597307.

DOI:10.1110/ps.062597307

PMID:17456745

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2206635/

Abstract

Dividing protein structures into domains has proven useful for more accurate structural and functional characterization of proteins. Here, we develop a method, called DDOMAIN, that divides a structure into DOMAINs using a normalized contact-based domain-domain interaction profile. Results of DDOMAIN are compared to AUTHORS annotations (domain definitions given by the authors who solved the protein structures), as well as to the popular SCOP and CATH annotations by human experts and automatic programs. DDOMAIN's automatic annotations are most consistent with the AUTHORS annotations (90% agreement in number of domains and 88% agreement in both number of domains and at least 85% overlap in domain assignment of residues) if its three adjustable parameters are trained on the AUTHORS annotations. By comparison, the agreement is 83% (81% with the at least 85% overlap criterion) between SCOP-trained DDOMAIN and SCOP annotations and 77% (73%) between CATH-trained DDOMAIN and CATH annotations. The agreement between DDOMAIN and AUTHORS annotations goes beyond single-domain proteins (97%, 82%, and 56% for single-, two-, and three-domain proteins, respectively). For an "easy" data set of proteins whose CATH and SCOP annotations agree with each other in number of domains, the agreement is 90% (89%) between "easy-set"-trained DDOMAIN and CATH/SCOP annotations. The consistency between SCOP-trained DDOMAIN and SCOP annotations is superior to that of two other recently developed, SCOP-trained, automatic methods, PDP (protein domain parser) and DomainParser 2. We also tested a simple consensus method made of PDP, DomainParser 2, and DDOMAIN, and a different version of DDOMAIN based on a more sophisticated statistical energy function. The DDOMAIN server and its executable are available in the services section on http://sparks.informatics.iupui.edu.
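
The two agreement criteria quoted in the abstract (same number of domains, and same number of domains plus at least 85% overlap in the domain assignment of residues) can be illustrated with a small sketch. The abstract does not define how overlap is computed, so this example assumes it is the best fraction of residues carrying matching domain labels over all one-to-one pairings of predicted and reference domains. The function names `domain_overlap` and `agrees` and the per-residue label encoding are hypothetical and are not taken from DDOMAIN.

```python
from itertools import permutations


def domain_overlap(pred, ref):
    """Best fraction of residues with matching domain labels.

    pred and ref are per-residue domain labels of equal length.
    ASSUMPTION: overlap is maximized over one-to-one label pairings;
    the abstract does not give the exact definition.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("assignments must be non-empty and equally long")
    pred_labels = sorted(set(pred))
    ref_labels = sorted(set(ref))
    if len(pred_labels) != len(ref_labels):
        # Different domain counts: treated as zero overlap in this sketch.
        return 0.0
    best = 0
    # Brute force is acceptable because proteins have few domains.
    for perm in permutations(ref_labels):
        mapping = dict(zip(pred_labels, perm))
        matches = sum(mapping[p] == r for p, r in zip(pred, ref))
        best = max(best, matches)
    return best / len(pred)


def agrees(pred, ref, min_overlap=0.85):
    """True if the domain counts match and the overlap reaches min_overlap."""
    same_count = len(set(pred)) == len(set(ref))
    return same_count and domain_overlap(pred, ref) >= min_overlap


# Toy 100-residue protein with a reference boundary at residue 50.
reference = [1] * 50 + [2] * 50
shifted_by_5 = [2] * 45 + [1] * 55   # labels swapped, boundary off by 5
shifted_by_20 = [1] * 70 + [2] * 30  # boundary off by 20

print(domain_overlap(shifted_by_5, reference), agrees(shifted_by_5, reference))
print(domain_overlap(shifted_by_20, reference), agrees(shifted_by_20, reference))
```

In this toy case the first prediction passes the 85% criterion even though its domain labels are swapped, while the second matches the domain count but fails on overlap. That gap is why the abstract reports two agreement figures for each comparison.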
