Review and comparative assessment of sequence-based predictors of protein-binding residues.

Affiliations

School of Computer and Information Technology, Xinyang Normal University.

Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.

Publication information

Brief Bioinform. 2018 Sep 28;19(5):821-837. doi: 10.1093/bib/bbx022.

Abstract

Understanding the molecular mechanisms that govern protein-protein interactions and accurately modeling protein-protein docking both rely on accurate identification and prediction of protein-binding partners and protein-binding residues. We review over 40 methods that predict protein-protein interactions from protein sequences, including methods that predict interacting protein pairs, protein-binding residues for a pair of interacting sequences and protein-binding residues in a single protein chain. We focus on the last group of methods, which provide residue-level annotations and can be broadly applied to all protein sequences. We compare their architectures, inputs and outputs, and we discuss aspects related to their assessment and availability. We also perform a first-of-its-kind comprehensive empirical comparison of representative predictors of protein-binding residues using a novel and high-quality benchmark data set. We show that the selected predictors accurately discriminate protein-binding and non-binding residues and that newer methods outperform older designs. However, these methods are unable to accurately separate residues that bind other molecules, such as DNA, RNA and small ligands, from the protein-binding residues. This cross-prediction, defined as the incorrect prediction of nucleic-acid- and small-ligand-binding residues as protein binding, is substantial for all evaluated methods and is not driven by the proximity to the native protein-binding residues. We discuss reasons for this drawback and offer several recommendations. In particular, we postulate the need for a new generation of more accurate predictors and data sets, inclusion of a comprehensive assessment of the cross-predictions in future studies and higher standards of availability of the published methods.
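The cross-prediction described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula used in the study, so the code below assumes one plausible reading: the cross-prediction rate is the fraction of residues that bind other ligands (DNA, RNA or small ligands) that a predictor labels as protein-binding. The function names, the binary per-residue label encoding and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def _check_lengths(a: Sequence[int], b: Sequence[int]) -> None:
    if len(a) != len(b):
        raise ValueError("label sequences must have the same length")


def sensitivity(predicted: Sequence[int], protein_binding: Sequence[int]) -> float:
    """Fraction of native protein-binding residues predicted as protein-binding."""
    _check_lengths(predicted, protein_binding)
    total = sum(protein_binding)
    if total == 0:
        raise ValueError("no residues annotated as protein-binding")
    hits = sum(1 for p, y in zip(predicted, protein_binding) if p == 1 and y == 1)
    return hits / total


def cross_prediction_rate(predicted: Sequence[int], other_binding: Sequence[int]) -> float:
    """Assumed definition: fraction of residues that bind other ligands
    (DNA, RNA, small ligands) but are predicted as protein-binding."""
    _check_lengths(predicted, other_binding)
    total = sum(other_binding)
    if total == 0:
        raise ValueError("no residues annotated as binding other ligands")
    cross = sum(1 for p, o in zip(predicted, other_binding) if p == 1 and o == 1)
    return cross / total


# Toy chain of 8 residues (hypothetical labels, 1 = yes, 0 = no).
predicted       = [1, 1, 0, 0, 1, 0, 1, 0]  # predictor output: protein-binding?
protein_binding = [1, 1, 0, 0, 0, 0, 0, 0]  # native protein-binding residues
other_binding   = [0, 0, 0, 0, 1, 1, 1, 0]  # residues binding DNA, RNA or small ligands

print(f"sensitivity:           {sensitivity(predicted, protein_binding):.2f}")
print(f"cross-prediction rate: {cross_prediction_rate(predicted, other_binding):.2f}")
```

In this toy case the predictor finds both protein-binding residues yet also flags two of the three residues that bind other ligands, which is the kind of behaviour the abstract reports as substantial across the evaluated methods.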
