Jain Shobhit, Bader Gary D
Department of Computer Science and The Donnelly Centre, University of Toronto, Toronto, ON, Canada.
Bioinformatics. 2016 Jun 15;32(12):1865-72. doi: 10.1093/bioinformatics/btw045. Epub 2016 Feb 9.
Many intracellular signaling processes are mediated by interactions involving peptide recognition modules such as SH3 domains. These domains bind to small, linear protein sequence motifs, which can be identified using high-throughput experimental screens such as phage display. Binding motif patterns can then be used to computationally predict protein interactions mediated by these domains. While many protein-protein interaction prediction methods exist, most do not work with peptide recognition module-mediated interactions or do not consider many of the known constraints governing physiologically relevant interactions between two proteins.
A novel method for predicting physiologically relevant SH3 domain-peptide mediated protein-protein interactions in S. cerevisiae using phage display data is presented. Like some previous similar methods, this method uses position weight matrix models of protein linear motif preference for individual SH3 domains to scan the proteome for potential hits and then filters these hits using a range of evidence sources related to sequence-based and cellular constraints on protein interactions. The novelty of this approach lies in the large number of evidence sources used and in the way sequence-based and protein-pair-based evidence sources are combined. By combining different peptide and protein features using multiple Bayesian models, we are able to predict high-confidence interactions with an overall accuracy of 0.97.
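The scan-then-filter idea described above can be illustrated with a minimal sketch. It is not the DoMo-Pred implementation: the toy position weight matrix, the uniform background frequency, the floor probability, and the function names `window_score`, `scan`, and `combine_evidence` are all assumptions made for illustration. The evidence combination shown is a plain naive Bayes product of likelihood ratios, which is a simplification standing in for the multiple Bayesian models mentioned in the abstract.

```python
import math

# Hypothetical toy PWM: one dict per motif position, residue -> probability.
# Real matrices would be built from phage display peptides; these values are made up.
TOY_PWM = [
    {"P": 0.7, "A": 0.1},
    {"L": 0.4, "P": 0.3},
    {"P": 0.8},
]
BACKGROUND = 0.05  # assumed uniform background frequency over 20 amino acids
PSEUDO = 0.01      # assumed floor probability for residues missing from a column


def window_score(pwm, window):
    """Return the summed per-position log-odds score (base 2) of one window."""
    return sum(
        math.log2(column.get(residue, PSEUDO) / BACKGROUND)
        for column, residue in zip(pwm, window)
    )


def scan(pwm, sequence, threshold):
    """Slide the PWM along a protein sequence and keep windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = window_score(pwm, window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


def combine_evidence(prior_odds, likelihood_ratios):
    """Naive Bayes combination: multiply prior odds by each evidence source's
    likelihood ratio, then convert the posterior odds to a probability."""
    odds = prior_odds
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Scan a made-up sequence, then combine two made-up evidence ratios for a hit.
    for start, window, score in scan(TOY_PWM, "MAPLPK", threshold=5.0):
        print(start, window, round(score, 3))
    print(combine_evidence(prior_odds=1.0, likelihood_ratios=[2.0, 3.0]))
```

In this sketch each candidate hit from the sequence scan would be passed on with likelihood ratios from other evidence sources, and only hits whose combined posterior probability exceeds a chosen cutoff would be reported as predicted interactions.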
The Domain-Motif Mediated Interaction Prediction (DoMo-Pred) command line tool and all relevant datasets are available under the GNU LGPL license for download from http://www.baderlab.org/Software/DoMo-Pred. The DoMo-Pred command line tool is implemented using Python 2.7 and C++.
Supplementary data are available at Bioinformatics online.