Kamisetty Hetunandan, Ghosh Bornika, Langmead Christopher James, Bailey-Kellogg Chris
Department of Biochemistry, University of Washington.
Department of Computer Science, Dartmouth.
Res Comput Mol Biol. 2014;8394:129-143. doi: 10.1007/978-3-319-05269-4_10.
In studying the strength and specificity of interaction between members of two protein families, key questions center on which pairs of possible partners actually interact, how they interact, and why they interact while others do not. The advent of large-scale experimental studies of interactions between members of a target family and a diverse set of possible interaction partners offers the opportunity to address these questions. We develop here a method, DgSpi (Data-driven Graphical models of Specificity in Protein:protein Interactions), for learning and using graphical models that explicitly represent the amino acid basis for interaction specificity (why) and extend earlier classification-oriented approaches (which) to predict the ΔG of binding (how). We demonstrate the effectiveness of our approach in analyzing and predicting interactions between a set of 82 PDZ recognition modules and a panel of 217 possible peptide partners, based on data from MacBeath and colleagues. Our predicted ΔG values track the experimentally measured ones, reaching correlation coefficients of 0.69 in 10-fold cross-validation and 0.63 in leave-one-PDZ-out cross-validation. Furthermore, the model serves as a compact representation of the amino acid constraints underlying the interactions, so that protein-level ΔG predictions can be understood directly in terms of residue-level constraints. Finally, as a generative model, DgSpi readily enables the design of new interacting partners, and we demonstrate that designed ligands are novel and diverse.
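The two evaluation protocols named in the abstract differ only in how test sets are formed: 10-fold cross-validation holds out random measurements, whereas leave-one-PDZ-out holds out every measurement for one domain at a time. The following is a minimal sketch of that difference, not the DgSpi method itself. The data are synthetic, a plain ridge regression stands in for the graphical model, and every name here (`kfold_splits`, `leave_one_group_out_splits`, `ridge_fit_predict`, the feature matrix `X`) is a hypothetical placeholder.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def kfold_splits(n, k, rng):
    """Yield k disjoint test-index arrays covering a random permutation of 0..n-1."""
    for fold in np.array_split(rng.permutation(n), k):
        yield fold

def leave_one_group_out_splits(groups):
    """Yield one test-index array per distinct group label (e.g. one per domain)."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        yield np.flatnonzero(groups == g)

def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Stand-in predictor (ridge regression); an assumption, NOT the paper's model."""
    d = X_train.shape[1]
    w = np.linalg.solve(X_train.T @ X_train + lam * np.eye(d), X_train.T @ y_train)
    return X_test @ w

def cross_validated_correlation(X, y, test_index_sets):
    """Predict every held-out point once, then correlate predictions with y."""
    preds = np.empty(len(y), dtype=float)
    for test_idx in test_index_sets:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        preds[test_idx] = ridge_fit_predict(X[train_mask], y[train_mask], X[test_idx])
    return pearson(preds, y)

# Synthetic stand-in data: 8 "domains", 20 measurements each (not the real data set).
rng = np.random.default_rng(0)
n_domains, n_per_domain, n_features = 8, 20, 5
groups = np.repeat(np.arange(n_domains), n_per_domain)
X = rng.normal(size=(groups.size, n_features))
y = X @ rng.normal(size=n_features) + 0.5 * rng.normal(size=groups.size)

r_kfold = cross_validated_correlation(X, y, kfold_splits(len(y), 10, rng))
r_logo = cross_validated_correlation(X, y, leave_one_group_out_splits(groups))
print(f"10-fold r = {r_kfold:.2f}, leave-one-group-out r = {r_logo:.2f}")
```

Holding out a whole domain is the stricter test, since the model never sees any measurement for that domain during training; this is consistent with the lower correlation the abstract reports for leave-one-PDZ-out (0.63) than for 10-fold cross-validation (0.69).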