IntPred: a structure-based predictor of protein-protein interaction sites.

Authors

Northey Thomas C, Barešić Anja, Martin Andrew C R

Affiliation

Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK.

Publication

Bioinformatics. 2018 Jan 15;34(2):223-229. doi: 10.1093/bioinformatics/btx585.

Abstract

MOTIVATION

Protein-protein interactions are vital for protein function, with the average protein having between three and ten interacting partners. Knowledge of precise protein-protein interfaces comes from crystal structures deposited in the Protein Data Bank (PDB), but only 50% of structures in the PDB are complexes. There is therefore a need to predict protein-protein interfaces in silico, and various methods have been developed for this purpose. Here we explore the use of a predictor that is based on structural features and exploits random forest machine learning, comparing its performance with a number of popular established methods.

RESULTS

On an independent test set of obligate and transient complexes, our IntPred predictor performs well (MCC = 0.370, ACC = 0.811, SPEC = 0.916, SENS = 0.411) and compares favourably with other methods. Overall, IntPred ranks second of six methods tested with SPPIDER having slightly better overall performance (MCC = 0.410, ACC = 0.759, SPEC = 0.783, SENS = 0.676), but considerably worse specificity than IntPred. As with SPPIDER, using an independent test set of obligate complexes enhanced performance (MCC = 0.381) while performance is somewhat reduced on a dataset of transient complexes (MCC = 0.303). The trade-off between sensitivity and specificity compared with SPPIDER suggests that the choice of the appropriate tool is application-dependent.
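The four scores quoted above (MCC, ACC, SPEC, SENS) are standard summaries of a binary confusion matrix, here presumably computed per residue (interface vs. non-interface). As a minimal sketch of how they relate, the function below computes them from raw counts. The function name and the example counts are hypothetical, chosen for illustration only; they are not taken from the paper or from the IntPred code.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Return MCC, ACC, SPEC and SENS for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    acc = (tp + tn) / total                       # accuracy
    sens = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity (recall)
    spec = tn / (tn + fp) if (tn + fp) else 0.0   # specificity

    # Matthews correlation coefficient; defined as 0 when any margin is empty
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"MCC": mcc, "ACC": acc, "SPEC": spec, "SENS": sens}


# Hypothetical counts for illustration only (not data from the paper)
metrics = binary_metrics(tp=40, tn=90, fp=10, fn=60)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it is less affected by class imbalance than accuracy, which is one reason a predictor with high specificity and modest sensitivity can still show high accuracy when non-interface residues dominate.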

AVAILABILITY AND IMPLEMENTATION

IntPred is implemented in Perl and may be downloaded for local use or run via a web server at www.bioinf.org.uk/intpred/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
