Machine learning builds full-QM precision protein force fields in seconds.

Affiliations

Shanghai Jiao Tong University, China.

Shanghai First Maternity and Infant Hospital, Tongji University School of Medicine, Shanghai, China.

Publication information

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab158.

Abstract

Full quantum mechanics (QM) calculations are extraordinarily precise but difficult to apply to large systems such as biomolecules. Motivated by the massive demand for efficient full-QM-level calculations on large systems and by the significant advances in machine learning, we have designed a neural network-based two-body molecular fractionation with conjugate caps (NN-TMFCC) approach to accelerate the energy and atomic force calculations of proteins. The proposed NN potential energy surface models of residue-based fragments show very high precision, with energy root-mean-squared errors (RMSEs) below 1.0 kcal/mol and force RMSEs below 1.3 kcal/mol/Å for both training and testing sets. The NN-TMFCC method calculates the energies and atomic forces of 15 representative proteins with full-QM precision in 10 to 100 s, which is thousands of times faster than the full-QM calculations. The computational complexity of the NN-TMFCC method is independent of the protein size and depends only on the number of residue species, which makes the method particularly suitable for rapid prediction on large systems, where the acceleration reaches tens of thousands or even hundreds of thousands of times. This highly precise and efficient NN-TMFCC approach exhibits considerable potential for performing energy and force calculations, structure predictions and molecular dynamics simulations of proteins with full-QM precision.
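
The abstract reports accuracy as energy and force RMSEs. As a rough illustration of how such figures are typically computed, the sketch below compares model predictions against reference values. It is not the authors' code: the function names and the sample numbers are hypothetical, and it assumes the force RMSE is taken over individual Cartesian components, a convention the abstract does not specify.

```python
import math


def energy_rmse(predicted, reference):
    """RMSE between predicted and reference energies (e.g. in kcal/mol)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def force_rmse(predicted, reference):
    """RMSE over all Cartesian force components (e.g. in kcal/mol/Å).

    Each argument is a list of structures; each structure is a list of
    (fx, fy, fz) tuples, one per atom. Averaging per component is an
    assumption made for this sketch.
    """
    squared = []
    for pred_structure, ref_structure in zip(predicted, reference):
        for pred_atom, ref_atom in zip(pred_structure, ref_structure):
            for p, r in zip(pred_atom, ref_atom):
                squared.append((p - r) ** 2)
    if not squared:
        raise ValueError("no force components to compare")
    return math.sqrt(sum(squared) / len(squared))


# Made-up numbers for illustration only; they are not data from the paper.
nn_energies = [-10.2, -8.9, -12.4]
qm_energies = [-10.0, -9.0, -12.0]
nn_forces = [[(1.0, 0.0, -1.0), (0.5, 0.5, 0.0)]]
qm_forces = [[(1.0, 0.0, 0.0), (0.5, -0.5, 0.0)]]

print(f"energy RMSE: {energy_rmse(nn_energies, qm_energies):.3f} kcal/mol")
print(f"force RMSE:  {force_rmse(nn_forces, qm_forces):.3f} kcal/mol/Å")
```

Under this convention, a model would meet the thresholds quoted in the abstract when the first value is below 1.0 and the second is below 1.3 on both the training and testing sets.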
