Li Zhenhua, Wong Limsoon, Li Jinyan
Bioinformatics Research Center, School of Computer Engineering, Nanyang Technological University, Singapore.
BMC Syst Biol. 2011 Jun 20;5 Suppl 1(Suppl 1):S5. doi: 10.1186/1752-0509-5-S1-S5.
A protein binding hot spot is a cluster of interface residues that are energetically important for the binding of the protein to its interaction partner. Identifying protein binding hot spots can provide useful information for protein engineering and drug design, and can also deepen our understanding of protein-protein interaction. These residues are usually buried inside the interface and have a very low solvent accessible surface area (SASA). SASA is therefore widely used as a prominent feature by many computational methods for hot spot prediction. However, SASA cannot distinguish slightly buried residues, most of which are not hot spots, from deeply buried ones, which usually lie inside a hot spot.
We propose a new descriptor called "burial level" for characterizing residues, atoms and atomic contacts. Specifically, burial level captures how deeply a residue is buried. We identify different kinds of deeply buried atomic contacts (DBAC) at different burial levels that are directly broken by alanine substitution. We use the counts of these contacts as input to a support vector machine (SVM) that classifies residues as hot spot or non-hot spot. We achieve an F-measure of 0.6237 under leave-one-out cross-validation on a data set containing 258 mutations. This performance is better than that of other computational methods.
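The evaluation protocol described above (leave-one-out cross-validation scored by F-measure) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature vectors are assumed to be DBAC counts per burial level, the toy data are invented for the example, and `train_threshold_rule` is a hypothetical stand-in for the SVM so that the sketch needs only the standard library.

```python
from statistics import mean
from typing import Callable, List, Sequence

Features = Sequence[float]  # assumed: DBAC counts, one entry per burial level
Classifier = Callable[[Features], int]
Trainer = Callable[[List[Features], List[int]], Classifier]


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (1 = hot spot)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def leave_one_out_predictions(
    X: List[Features], y: List[int], train: Trainer
) -> List[int]:
    """Hold out each sample in turn, train on the rest, predict the held-out one."""
    predictions = []
    for i in range(len(X)):
        X_train = [x for j, x in enumerate(X) if j != i]
        y_train = [label for j, label in enumerate(y) if j != i]
        model = train(X_train, y_train)
        predictions.append(model(X[i]))
    return predictions


def train_threshold_rule(X: List[Features], y: List[int]) -> Classifier:
    """Hypothetical stand-in for the SVM: threshold on the total contact count."""
    pos = [sum(x) for x, label in zip(X, y) if label == 1]
    neg = [sum(x) for x, label in zip(X, y) if label == 0]
    if not pos or not neg:
        # Only one class present in the training fold: predict that class.
        majority = 1 if pos else 0
        return lambda x: majority
    cut = (mean(pos) + mean(neg)) / 2  # midpoint between the class means
    return lambda x: 1 if sum(x) > cut else 0


# Toy data, invented for illustration only (not taken from the paper).
toy_X = [[5, 3], [4, 4], [0, 1], [1, 0]]
toy_y = [1, 1, 0, 0]
toy_pred = leave_one_out_predictions(toy_X, toy_y, train_threshold_rule)
toy_score = f_measure(toy_y, toy_pred)
```

Pooling the held-out predictions before computing a single F-measure, rather than averaging a per-fold score, is the natural choice here because each fold contains only one sample.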
Our results show that hot spot residues tend to be deeply buried in the interface, rather than merely having a low SASA value. This indicates that a high burial level is not only a necessary condition for a residue to be a hot spot residue, but also a more sufficient one than a low SASA. We find that deeply buried atoms become increasingly important as their burial levels rise. This work also confirms the contribution of deeply buried interfacial atomic contacts to the energy of protein binding hot spots.