Miyazaki Satoshi, Kuroda Yutaka, Yokoyama Shigeyuki
Department of Biophysics and Biochemistry, Graduate School of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan.
J Struct Funct Genomics. 2002;2(1):37-51. doi: 10.1023/a:1014418700858.
We describe a neural network analysis of the sequences that connect two protein domains (domain linkers). The neural network was trained to distinguish domain linker sequences from non-linker sequences, using a SCOP-defined domain library. The analysis indicated a significant difference between domain linkers and non-linker regions, including intra-domain loop regions. Moreover, the resulting Hinton diagram showed a position-dependent amino acid preference in the domain linker sequences, implying that they are non-random. We then applied the neural network to predict domain linkers in multi-domain protein sequences. In a jackknife test, 58% of the predicted regions matched actual linker regions (specificity), and 36% of the SCOP-derived domain linkers were predicted (sensitivity). This prediction performance is superior to that of simpler methods derived from secondary structure prediction, which assume that long loop regions are putative domain linkers. Taken together, these results suggest that domain linkers possess local characteristics different from those of loop regions.
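The two figures reported in the abstract can be illustrated with a minimal sketch. "Specificity" here is the fraction of predicted regions that match an actual linker (more commonly called precision), and "sensitivity" is the fraction of actual linkers that are predicted. The abstract does not state how a match is decided, so the overlap criterion below is an assumption, and the function names and residue ranges are hypothetical toy values, not data from the paper.

```python
from typing import List, Tuple

# A region is an inclusive residue range (start, end).
Region = Tuple[int, int]


def overlaps(a: Region, b: Region) -> bool:
    """Return True if the two inclusive ranges share at least one residue.

    Assumption: any overlap counts as a match; the abstract does not
    specify the matching criterion.
    """
    return a[0] <= b[1] and b[0] <= a[1]


def linker_scores(predicted: List[Region], actual: List[Region]) -> Tuple[float, float]:
    """Compute (specificity, sensitivity) as the abstract defines them."""
    # Predicted regions that overlap at least one actual linker.
    matched_predictions = sum(
        any(overlaps(p, a) for a in actual) for p in predicted
    )
    # Actual linkers that are overlapped by at least one prediction.
    recovered_linkers = sum(
        any(overlaps(a, p) for p in predicted) for a in actual
    )
    specificity = matched_predictions / len(predicted) if predicted else 0.0
    sensitivity = recovered_linkers / len(actual) if actual else 0.0
    return specificity, sensitivity


# Hypothetical toy data: three predicted regions, two actual linkers.
predicted = [(40, 52), (118, 130), (200, 210)]
actual = [(45, 60), (300, 312)]
spec, sens = linker_scores(predicted, actual)
print(f"specificity = {spec:.2f}, sensitivity = {sens:.2f}")
```

In this toy case one of the three predictions overlaps a real linker and one of the two real linkers is recovered. Stricter criteria (for example, requiring a minimum overlap length) would lower both scores.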