School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China.
Brief Bioinform. 2019 Jan 18;20(1):330-346. doi: 10.1093/bib/bbx126.
Intrinsically disordered proteins and regions are widespread and are associated with many biological processes and diseases. Accurate prediction of intrinsically disordered proteins and regions is critical for both basic research (such as protein structure and function prediction) and practical applications (such as drug development). Over the past decades, many computational approaches have been proposed, which have greatly advanced this important field. A comprehensive and up-to-date review is therefore needed. Here, we review the computational methods for intrinsically disordered protein and region prediction, focusing especially on recent developments. These approaches are divided into four categories according to their methodology: physicochemical-based methods, machine-learning-based methods, template-based methods and meta methods. Their advantages and disadvantages are also discussed. The performance of 40 state-of-the-art predictors is directly compared on the target proteins of the disordered region prediction task in the 10th Critical Assessment of protein Structure Prediction (CASP10). A more comprehensive performance comparison of 45 different predictors is conducted on seven widely used benchmark data sets. Finally, some open problems and perspectives are discussed.