
Basis for Accurate Protein pKa Prediction with Machine Learning.

Author information

College of Computer Engineering, Jimei University, Xiamen 361021, China.

Publication information

J Chem Inf Model. 2023 May 22;63(10):2936-2947. doi: 10.1021/acs.jcim.3c00254. Epub 2023 May 5.

Abstract

pH regulates protein structures and the associated functions in many biological processes via protonation and deprotonation of ionizable side chains where the titration equilibria are determined by pKa's. To accelerate pH-dependent molecular mechanism research in the life sciences or industrial protein and drug designs, fast and accurate pKa prediction is crucial. Here we present a theoretical pKa data set PHMD549, which was successfully applied to four distinct machine learning methods, including DeepKa, which was proposed in our previous work. To reach a valid comparison, EXP67S was selected as the test set. Encouragingly, DeepKa was improved significantly and outperforms other state-of-the-art methods, except for the constant-pH molecular dynamics, which was utilized to create PHMD549. More importantly, DeepKa reproduced experimental pKa orders of acidic dyads in five enzyme catalytic sites. Apart from structural proteins, DeepKa was found applicable to intrinsically disordered peptides. Further, in combination with solvent exposures, it is revealed that DeepKa offers the most accurate prediction under the challenging circumstance that hydrogen bonding or salt bridge interaction is partly compensated by desolvation for a buried side chain. Finally, our benchmark data qualify PHMD549 and EXP67S as the basis for future developments of protein pKa prediction tools driven by artificial intelligence. In addition, DeepKa built on PHMD549 has been proven an efficient protein pKa predictor and thus can be applied immediately to, for example, pKa database construction, protein design, drug discovery, and so on.
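As a small illustration of how a pKa determines a titration equilibrium, the sketch below evaluates the standard Henderson-Hasselbalch relation for a single, non-interacting titratable site. This relation is textbook background and is not stated in the abstract; the function name and the example pH and pKa values are hypothetical and are not taken from PHMD549, EXP67S, or DeepKa.

```python
def deprotonated_fraction(ph: float, pka: float) -> float:
    """Fraction of an isolated titratable site that is deprotonated at a given pH.

    Assumes the Henderson-Hasselbalch relation for one non-interacting site:
    fraction = 1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Hypothetical pKa values chosen only for illustration.
    for pka in (4.0, 6.5, 10.5):
        frac = deprotonated_fraction(7.0, pka)
        print(f"pKa={pka:4.1f}  deprotonated fraction at pH 7.0: {frac:.3f}")
```

At pH equal to the pKa the site is half deprotonated, so a shift in predicted pKa of one unit changes the protonated-to-deprotonated ratio tenfold, which is why prediction accuracy matters for side chains titrating near physiological pH.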

