Li Jinyan, Liu Qian
Bioinformatics Research Center, School of Computer Engineering, Nanyang Technological University, Singapore 639798.
Bioinformatics. 2009 Mar 15;25(6):743-50. doi: 10.1093/bioinformatics/btp058. Epub 2009 Jan 29.
The O-ring theory holds that the binding hot spot at a protein interface is surrounded by a ring of residues that are energetically less important than the residues in the hot spot. Because this ring of residues serves to occlude water molecules from the hot spot, the O-ring theory is also called the 'water exclusion' hypothesis. We propose a 'double water exclusion' hypothesis that refines the O-ring theory by assuming that the hot spot itself is water-free. To model a water-free hot spot computationally, we use a biclique pattern, defined as two maximal groups of residues from two chains in a protein complex such that every residue in one group contacts all residues in the other group.
Given a chain pair A and B of a protein complex from the Protein Data Bank (PDB), we calculate the interatomic distances of all possible pairs of atoms between A and B. We then represent A and B as a bipartite graph based on this distance information. Maximal biclique subgraphs are subsequently identified from all of the bipartite graphs to locate biclique patterns at the interfaces. We address two properties of biclique patterns: a non-redundant occurrence in the PDB, and a correspondence with hot spots when the solvent-accessible surface area (SASA) of a biclique pattern in the complex form is small. A total of 1293 biclique patterns are discovered that have a non-redundant occurrence of at least five and that each have a minimum of two and four residues on the two sides. Through extensive queries to the HotSprint and ASEdb databases, we verified that biclique patterns are rich in true hot residues. Our algorithm and results provide a new way to identify hot spots by examining proteins' structural data.
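The pipeline described above (interatomic distances, then a bipartite contact graph, then maximal bicliques) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `contact_graph` and `maximal_bicliques`, the input layout (lists of `(residue_id, (x, y, z))` atom records), and the 4.0 Å default cutoff are all assumptions made for illustration; the abstract does not state the distance threshold used. The enumeration relies on the fact that each maximal biclique corresponds to a set of chain-B residues that is an intersection of chain-A neighborhoods. This approach is simple but can take exponential time in the worst case.

```python
from itertools import product
from math import dist


def contact_graph(atoms_a, atoms_b, cutoff=4.0):
    """Map each chain-A residue to the set of chain-B residues it contacts.

    atoms_a, atoms_b: iterables of (residue_id, (x, y, z)) atom records.
    cutoff: hypothetical contact distance in angstroms (an assumption).
    Two residues are in contact if any pair of their atoms is within cutoff.
    """
    graph = {}
    for (res_a, xyz_a), (res_b, xyz_b) in product(atoms_a, atoms_b):
        if dist(xyz_a, xyz_b) <= cutoff:
            graph.setdefault(res_a, set()).add(res_b)
    return graph


def maximal_bicliques(graph, min_a=1, min_b=1):
    """Enumerate maximal bicliques as (side_a, side_b) pairs of frozensets.

    The chain-B side of every maximal biclique is an intersection of
    neighborhoods of chain-A residues, so we close the set of neighborhoods
    under intersection and recover the chain-A side for each result.
    """
    neighborhoods = [frozenset(n) for n in graph.values()]
    closed = set()
    stack = list(set(neighborhoods))
    while stack:
        side_b = stack.pop()
        if not side_b or side_b in closed:
            continue
        closed.add(side_b)
        # Intersecting with each neighborhood may yield new closed sets.
        stack.extend(side_b & n for n in neighborhoods)

    result = []
    for side_b in closed:
        # All chain-A residues that contact every residue in side_b.
        side_a = frozenset(a for a, n in graph.items() if side_b <= n)
        if len(side_a) >= min_a and len(side_b) >= min_b:
            result.append((side_a, side_b))
    return result
```

The `min_a` and `min_b` parameters are hypothetical stand-ins for the minimum group sizes mentioned above. Counting non-redundant occurrences across the PDB and filtering by SASA are separate steps that this sketch does not cover.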
The biclique mining algorithm is available at http://www.ntu.edu.sg/home/jyli/dwe.html.
Supplementary data are available at Bioinformatics online.