Characterization and prediction of linker sequences of multi-domain proteins by a neural network.

Author information

Miyazaki Satoshi, Kuroda Yutaka, Yokoyama Shigeyuki

Affiliation

Department of Biophysics and Biochemistry, Graduate School of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan.

Publication information

J Struct Funct Genomics. 2002;2(1):37-51. doi: 10.1023/a:1014418700858.

DOI: 10.1023/a:1014418700858

PMID: 12836673

Abstract

In this paper, we describe a neural network analysis of sequences connecting two protein domains (domain linkers). The neural network was trained to distinguish between domain linker sequences and non-linker sequences, using a SCOP-defined domain library. The analysis indicated that a significant difference existed between domain linkers and non-linker regions, including intra-domain loop regions. Moreover, the resulting Hinton diagram showed a position-dependent amino acid preference of the domain linker sequences, and implied their non-random nature. We then applied the neural network to predict domain linkers in multi-domain protein sequences. As the result of a Jack-knife test, 58% of the predicted regions matched actual linker regions (specificity), and 36% of the SCOP-derived domain linkers were predicted (sensitivity). This prediction efficiency is superior to simpler methods derived from secondary structure prediction that assume that long loop regions are putative domain linkers. Altogether, these results suggest that domain linkers possess local characteristics different from those of loop regions.
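
The abstract uses "specificity" for the fraction of predicted regions that match actual linkers (elsewhere usually called precision) and "sensitivity" for the fraction of actual linkers that were predicted. A minimal sketch of how these two ratios could be computed is given below. The region representation, the function names, the overlap-based matching rule, and the example coordinates are all assumptions made for illustration; the abstract does not state the matching criterion the authors used.

```python
from typing import List, Tuple

# Hypothetical representation: inclusive (start, end) residue indices.
Region = Tuple[int, int]


def overlaps(a: Region, b: Region) -> bool:
    """Return True if two inclusive residue ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def linker_metrics(predicted: List[Region], actual: List[Region]) -> Tuple[float, float]:
    """Return (specificity, sensitivity) in the sense used by the abstract.

    specificity: fraction of predicted regions that match an actual linker.
    sensitivity: fraction of actual linkers matched by some prediction.
    Matching by any overlap is an assumption, not the paper's stated rule.
    """
    matched_pred = sum(any(overlaps(p, a) for a in actual) for p in predicted)
    matched_act = sum(any(overlaps(a, p) for p in predicted) for a in actual)
    specificity = matched_pred / len(predicted) if predicted else 0.0
    sensitivity = matched_act / len(actual) if actual else 0.0
    return specificity, sensitivity


# Made-up coordinates, for illustration only.
predicted = [(40, 52), (118, 125), (300, 310)]
actual = [(45, 60), (200, 212)]
spec, sens = linker_metrics(predicted, actual)
print(f"specificity={spec:.2f}, sensitivity={sens:.2f}")
# prints "specificity=0.33, sensitivity=0.50"
```

Under this definition a predictor can score well on one ratio and poorly on the other, which is why the abstract reports both (58% and 36%).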
