Littler Stephen J, Hubbard Simon J
Faculty of Life Sciences, The University of Manchester, Jackson's Mill, P.O. Box 88, Manchester M60 1QD, UK.
J Mol Biol. 2005 Feb 4;345(5):1265-79. doi: 10.1016/j.jmb.2004.11.011. Epub 2004 Dec 16.
The repertoire of naturally occurring protein structures is usually characterised in structural terms at the domain level by their constituent folds. As structure is acknowledged to be an important stepping stone to the understanding of protein function, an appreciation of how individual domain interactions are built to form complete, functional protein structures is essential. A comprehensive study of protein domain interactions has been undertaken, covering all those observed in known structures, as well as those predicted to occur in 46 completed genome sequences from all three domains of life. In particular, we examine the promiscuity of protein domains characterised by SCOP superfamilies in terms of their interacting partners, the surface they use to form these interactions, and the relative orientations of their domain partners. Protein domains are shown to display a variety of behaviours, ranging from high promiscuity to absolute monogamy of domain surface employed, with both multiple and single domain partners. In addition, the conservation of sequence and volume at domain interface surfaces is observed to be significantly higher than at accessible surface in general, acting as a powerful potential predictor for domain interactions. We also examine the separation of interacting domains in protein sequence, showing that standard thresholds of 30 amino acid residues lead to a significant false positive rate, and an even more significant false negative rate of approximately 40%. These data suggest that there may be many more than the 2000 domain-domain interactions that have not yet been observed structurally, and we provide a top 30 hit-list of putative domain interactions which should be targeted.
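The abstract's point about sequence-separation thresholds can be illustrated with a small sketch. It is not the authors' method: the `DomainPair` type, its fields, the toy data, and the rate definitions (false positive rate = FP / (FP + TN), false negative rate = FN / (FN + TP)) are all assumptions made here for illustration, and the paper may define these rates differently. The sketch applies the rule "two domains separated by at most 30 residues are predicted to interact" and counts how often that rule disagrees with whether an interface is actually observed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainPair:
    """Hypothetical record for two domains in the same chain."""
    separation: int  # residues between the two domains in sequence
    interacts: bool  # True if an interface is observed in the structure


def threshold_error_rates(pairs, threshold=30):
    """Return (false_positive_rate, false_negative_rate) for the rule
    'separation <= threshold implies interaction'.

    Assumed definitions: FPR = FP / (FP + TN), FNR = FN / (FN + TP).
    """
    tp = fp = tn = fn = 0
    for pair in pairs:
        predicted = pair.separation <= threshold
        if predicted and pair.interacts:
            tp += 1
        elif predicted and not pair.interacts:
            fp += 1
        elif not predicted and pair.interacts:
            fn += 1
        else:
            tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Toy data, invented purely for illustration (not taken from the paper).
toy_pairs = [
    DomainPair(5, True), DomainPair(8, True), DomainPair(12, True),
    DomainPair(25, False),
    DomainPair(40, True), DomainPair(60, True), DomainPair(80, True),
    DomainPair(150, False), DomainPair(200, False),
]

fpr, fnr = threshold_error_rates(toy_pairs, threshold=30)
print(f"false positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")
```

In this toy set, interacting pairs that lie more than 30 residues apart are missed by the rule, which is the kind of false negative the abstract describes. Varying `threshold` shows how the two error rates trade off against each other.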