Department of Biochemistry, Siebens-Drake Medical Research Institute, Schulich School of Medicine and Dentistry, The University of Western Ontario, London, Ontario, Canada.
PLoS One. 2011;6(10):e25528. doi: 10.1371/journal.pone.0025528. Epub 2011 Oct 7.
Protein-protein interactions (PPIs) are frequently mediated by the binding of a modular domain in one protein to a short, linear peptide motif in its partner. The advent of proteomic methods such as peptide and protein arrays has led to the accumulation of a wealth of interaction data for modular interaction domains. Although several computational programs have been developed to predict modular domain-mediated PPI events, they are often restricted to a given domain type. We describe DomPep, a method that can potentially be used to predict PPIs mediated by any modular domain. DomPep combines proteomic data with sequence information to achieve high accuracy and high coverage in PPI prediction. Proteomic binding data were employed to determine a simple yet novel parameter, Ligand-Binding Similarity, which in turn is used to calibrate Domain Sequence Identity and Position-Weighted-Matrix distance, the two parameters used to construct the prediction models. Moreover, DomPep can be used to predict PPIs for domains both with and without experimental binding data. Using the PDZ and SH2 domain families as test cases, we show that DomPep can predict PPIs with accuracies superior to those of existing methods. To evaluate DomPep as a discovery tool, we used it to identify interactions mediated by three human PDZ domains. Subsequent in-solution binding assays validated the high accuracy of DomPep in predicting authentic PPIs at the proteome scale. Because DomPep makes use of only interaction data and the primary sequence of a domain, it can be readily expanded to include other types of modular domains.
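The three parameters named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give their formulas, so every definition below is an assumption chosen for illustration and not the DomPep implementation: Ligand-Binding Similarity is taken as the Jaccard index of two domains' ligand sets, Domain Sequence Identity as the fraction of identical residues in two pre-aligned sequences, and Position-Weighted-Matrix distance as the Euclidean distance between per-position residue frequencies. All function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def ligand_binding_similarity(ligands_a, ligands_b):
    """Assumed definition: Jaccard index of two sets of bound peptides."""
    a, b = set(ligands_a), set(ligands_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def domain_sequence_identity(seq_a, seq_b):
    """Assumed definition: fraction of identical residues between two
    pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap.
    columns = [(x, y) for x, y in zip(seq_a, seq_b) if (x, y) != ("-", "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return matches / len(columns)


def build_pwm(peptides):
    """Per-position residue frequencies from equal-length peptides.
    Returns one {residue: frequency} dict per peptide position."""
    if not peptides:
        raise ValueError("at least one peptide is required")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("peptides must all have the same length")
    pwm = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        pwm.append({res: n / len(peptides) for res, n in counts.items()})
    return pwm


def pwm_distance(pwm_a, pwm_b):
    """Assumed definition: Euclidean distance over all positions and residues."""
    if len(pwm_a) != len(pwm_b):
        raise ValueError("matrices must cover the same number of positions")
    total = 0.0
    for col_a, col_b in zip(pwm_a, pwm_b):
        for res in set(col_a) | set(col_b):
            total += (col_a.get(res, 0.0) - col_b.get(res, 0.0)) ** 2
    return sqrt(total)
```

Under these assumptions, one could compute Ligand-Binding Similarity for pairs of domains that have array data and examine how it varies with sequence identity or matrix distance, which is the kind of calibration the abstract describes. The thresholds and model construction used by DomPep itself are not stated here and are not reproduced.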