Yuan Zheng, Wang Zhi-Xin
Institute for Molecular Bioscience and ARC Centre in Bioinformatics, The University of Queensland, Brisbane, Australia.
Proteins. 2008 Feb 1;70(2):509-16. doi: 10.1002/prot.21545.
Protein burying depth (BD) is a structural descriptor that indicates not only whether a residue is exposed or buried, but also how deeply it is buried. The widely used solvent accessible surface area focuses mainly on protein surface residues, whereas BD provides more detailed information about the arrangement of buried residues, which may be used to study the deep-level structure of proteins and the formation of the protein folding nucleus. In this work, we analyse the relationship between protein BD and sequence, and describe it with nonlinear functions estimated by support vector machines. We examine the functions by cross-validation tests and find a strong correlation between residue BD and the local sequence environment. By further taking into account the size of the molecule in which a residue is located, we find that the correlation coefficient between predicted and observed depths improves from 0.60 to 0.65. Moreover, nearly half of the deepest 10% of residues in a protein sequence can be correctly predicted. Our study suggests that the extent to which a residue is buried can be predicted, to some degree, from the residue itself and its local neighbouring residues. The methods used to estimate the sequence-depth functions are expected to become more useful in the investigation of protein structures and folding mechanisms.
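The abstract describes predicting depth from a residue's local sequence environment plus the size of the molecule, and scoring the result by a correlation coefficient and by recovery of the deepest 10% of residues. The following Python sketch illustrates one way such inputs and scores could be set up. It is not the authors' implementation: the one-hot window encoding, the window half-width of 7, the use of the logarithm of chain length as a size feature, and the overlap-based definition of "correctly predicted" are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def encode_windows(sequence, half_width=7):
    """One-hot encode a sliding window around each residue (assumed encoding).

    Each window position gets 21 slots: 20 residue types plus one slot that
    marks padding beyond the chain ends or a non-standard residue.
    The log of the chain length is appended as a hypothetical size feature.
    """
    n = len(sequence)
    size_feature = math.log(n)
    features = []
    for i in range(n):
        vec = []
        for j in range(i - half_width, i + half_width + 1):
            slot = [0.0] * (len(AMINO_ACIDS) + 1)
            if 0 <= j < n and sequence[j] in AMINO_ACIDS:
                slot[AMINO_ACIDS.index(sequence[j])] = 1.0
            else:
                slot[-1] = 1.0
            vec.extend(slot)
        vec.append(size_feature)
        features.append(vec)
    return features


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def deepest_recall(predicted, observed, fraction=0.10):
    """Fraction of the observed deepest residues that also rank in the
    predicted deepest set of the same size (assumed scoring rule)."""
    k = max(1, round(len(observed) * fraction))
    order_obs = sorted(range(len(observed)), key=lambda i: observed[i], reverse=True)
    order_pred = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    return len(set(order_obs[:k]) & set(order_pred[:k])) / k
```

The feature vectors returned by `encode_windows` could be passed to any support vector regression implementation, and `pearson` and `deepest_recall` could then be applied to its held-out predictions in a cross-validation loop.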