IS-Dom: a dataset of independent structural domains automatically delineated from protein structures.

Affiliation

Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 12-24-16 Nakamachi, Koganei-shi, Tokyo 184-8588, Japan.

Publication information

J Comput Aided Mol Des. 2013 May;27(5):419-26. doi: 10.1007/s10822-013-9654-6. Epub 2013 May 29.

DOI:10.1007/s10822-013-9654-6

PMID:23715893

Abstract

Protein domains that can fold in isolation are significant targets in diverse areas of proteomics research, as they are often readily analyzed by high-throughput methods. Here, we report IS-Dom, a dataset of Independent Structural Domains (ISDs) that are most likely to fold in isolation. IS-Dom was constructed by filtering domains from SCOP, CATH, and DomainParser using quantitative structural measures, which were calculated by estimating inter-domain hydrophobic clusters and hydrogen bonds from the full-length protein's atomic coordinates. The ISD detection protocol is fully automated, and all of the computed interactions are stored on the server, which enables rapid updates of IS-Dom. We also prepared a standard IS-Dom using parameters optimized by maximizing Youden's index. The standard IS-Dom contained 54,860 ISDs, of which 25.5% had high sequence identity and termini overlap with a Protein Data Bank (PDB) cataloged sequence and are thus experimentally shown to fold in isolation [termed autonomously folded domains (AFDs)]. Furthermore, our ISD detection protocol missed less than 10% of the AFDs, which corroborated our protocol's ability to define structural domains that are able to fold independently. IS-Dom is available through the web server (http://domserv.lab.tuat.ac.jp/IS-Dom.html), where users can download the standard IS-Dom dataset, construct their own IS-Dom by interactively varying the parameters, or assess the structural independence of newly defined putative domains.
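The abstract states that the standard parameters were chosen by maximizing Youden's index, J = sensitivity + specificity - 1. The following is a minimal sketch of that kind of threshold selection, not the authors' implementation. It assumes a single hypothetical score per domain (for example, a count of inter-domain interactions, where lower means more independent) and a hypothetical reference label saying whether the domain is an experimentally confirmed AFD. The function names and the toy numbers are invented for illustration only.

```python
from typing import List, Tuple


def youden_index(scores: List[float], labels: List[bool], threshold: float) -> float:
    """Return J = sensitivity + specificity - 1 for the rule
    'score <= threshold means the domain is predicted independent'."""
    tp = fn = tn = fp = 0
    for score, is_afd in zip(scores, labels):
        predicted_independent = score <= threshold
        if is_afd and predicted_independent:
            tp += 1
        elif is_afd:
            fn += 1
        elif predicted_independent:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity - 1.0


def best_threshold(scores: List[float], labels: List[bool]) -> Tuple[float, float]:
    """Scan every observed score as a candidate cutoff and keep the one
    with the highest Youden's index (first one wins on ties)."""
    candidates = sorted(set(scores))
    return max(
        ((t, youden_index(scores, labels, t)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Hypothetical toy data: inter-domain interaction counts and AFD labels.
scores = [0, 1, 2, 3, 5, 6, 8, 9]
labels = [True, True, True, False, False, True, False, False]

threshold, j = best_threshold(scores, labels)
print(f"best threshold = {threshold}, Youden's J = {j:.2f}")
```

In this toy example the cutoff at 2 keeps three of the four labeled domains while admitting no unlabeled ones. The published protocol combines more than one structural measure (hydrophobic clusters and hydrogen bonds), so a real optimization would search over several parameters rather than a single cutoff.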
