Schillinger Christian, Boisguerin Prisca, Krause Gerd
Leibniz Institute for Molecular Pharmacology, Robert-Roessle-Strasse 10, Berlin, FU-Berlin, Germany.
Bioinformatics. 2009 Jul 1;25(13):1632-9. doi: 10.1093/bioinformatics/btp264. Epub 2009 Apr 17.
The flow of information within cellular pathways largely relies on specific protein-protein interactions. Discovering such interactions, which are mostly mediated by peptide recognition modules (PRMs), is therefore a fundamental step towards unravelling the complexity of diverse pathways. Since a peptide can be recognized by more than one PRM, and high-throughput experiments are both time-consuming and expensive, it would be preferable to narrow down the potential peptide ligands for a specific PRM by a computational method. First, we present Domain Interaction Footprint (DIF), a new approach that predicts peptides binding to PRMs based solely on the peptide sequence. Second, we show that our method can create a multi-classification model that assesses the binding specificity of a given peptide to all examined PRMs at once.
We first applied our approach to a previously investigated dataset of different SH3 domains and predicted their appropriate peptide ligands with exceptionally high accuracy. This result outperforms all recent methods trained on the same dataset. Furthermore, we used our technique to build two multi-classification models (one for SH3 and one for PDZ domains), each predicting the interaction preference between a peptide and every domain in the corresponding domain family at once. Because it predicts domain specificity reliably, our proposed approach can be seen as a first step towards a complete multi-domain classification model comprising all domains of one family. Such a comprehensive domain specificity model would benefit the quest for highly specific peptide ligands that interact solely with the domain of choice.
Supplementary data are available at Bioinformatics online.