Aumentado-Armstrong Tristan T, Istrate Bogdan, Murgita Robert A
Department of Anatomy and Cell Biology, McGill University, Montreal, Canada.
School of Computer Science, McGill University, Montreal, Canada.
Algorithms Mol Biol. 2015 Feb 15;10:7. doi: 10.1186/s13015-015-0033-9. eCollection 2015.
Interaction sites on protein surfaces mediate virtually all biological activities, and their identification holds promise for disease treatment and drug design. Novel algorithmic approaches for the prediction of these sites have been produced at a rapid rate, and the field has seen significant advancement over the past decade. However, the most current methods have not yet been reviewed in a systematic and comprehensive fashion. Herein, we describe the intricacies of the biological theory, datasets, and features required for modern protein-protein interaction site (PPIS) prediction, and present an integrative analysis of the state-of-the-art algorithms and their performance. First, the major sources of data used by predictors are reviewed, including training sets, evaluation sets, and methods for their procurement. Then, the features employed and their importance in the biological characterization of PPISs are explored. This is followed by a discussion of the methodologies adopted in contemporary prediction programs, as well as their relative performance on the datasets most recently used for evaluation. In addition, the potential utility that PPIS identification holds for rational drug design, hotspot prediction, and computational molecular docking is described. Finally, an analysis of the most promising areas for future development of the field is presented.