Collective variable discovery in the age of machine learning: reality, hype and everything in between.

Author information

Bhakat Soumendranath

Affiliation

Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Pennsylvania 19104-6059, USA

Publication information

RSC Adv. 2022 Sep 2;12(38):25010-25024. doi: 10.1039/d2ra03660f. eCollection 2022 Aug 30.

Abstract

Understanding the kinetic and thermodynamic profiles of biomolecules is necessary to understand their functional roles, which has a major impact on mechanism-driven drug discovery. Molecular dynamics simulation is routinely used to study conformational dynamics and molecular recognition in biomolecules. Statistical analysis of the high-dimensional spatiotemporal data generated by molecular dynamics simulation requires identifying a few low-dimensional variables that describe the essential dynamics of a system without significant loss of information. In physical chemistry, these low-dimensional variables are often called collective variables. Collective variables are used to generate reduced representations of free energy surfaces and to calculate transition probabilities between different metastable basins. However, the choice of collective variables is not trivial for complex systems. Collective variables range from geometric criteria, such as distances and dihedral angles, to abstract ones, such as weighted linear combinations of multiple geometric variables. The advent of machine learning algorithms has led to increasing use of abstract collective variables to represent biomolecular dynamics. In this review, I highlight several nuances of commonly used collective variables, ranging from geometric to abstract ones. Further, I present some cases where machine-learning-based collective variables were used to describe simple systems that in principle could have been described by geometric ones. Finally, I offer my thoughts on artificial general intelligence and how it can be used to discover and predict collective variables from spatiotemporal data generated by molecular dynamics simulations.
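
To make the distinction between geometric and abstract collective variables concrete, the following is a minimal sketch and not a method taken from the review. It computes two interatomic distances (geometric variables), combines them with fixed weights into a linear collective variable, and estimates a one-dimensional free energy profile from a histogram using F(s) = -kT ln P(s). The synthetic Gaussian "trajectory", the weights (0.7, 0.3), the bin count, and all function names are assumptions chosen purely for illustration.

```python
import math
import random

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def distance(a, b):
    """Geometric variable: Euclidean distance between two 3D points."""
    return math.dist(a, b)  # requires Python 3.8+


def linear_cv(features, weights):
    """Abstract variable: weighted linear combination of geometric features."""
    return sum(w * f for w, f in zip(weights, features))


def free_energy_profile(values, temperature=300.0, bins=20):
    """Histogram estimate of F(s) = -kT ln P(s), shifted so the minimum is 0.

    Bins that were never visited get a free energy of +inf.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all CV values are identical; cannot build a histogram")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        k = min(int((v - lo) / width), bins - 1)  # put the maximum in the last bin
        counts[k] += 1
    kt = KB * temperature
    total = len(values)
    free = [-kt * math.log(c / total) if c > 0 else math.inf for c in counts]
    fmin = min(free)
    centers = [lo + (k + 0.5) * width for k in range(bins)]
    return centers, [f - fmin for f in free]


# Hypothetical toy data: 2000 frames of 4 "atoms" with Gaussian coordinates.
random.seed(0)
frames = [[[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(4)]
          for _ in range(2000)]

d01 = [distance(f[0], f[1]) for f in frames]  # geometric variable 1
d23 = [distance(f[2], f[3]) for f in frames]  # geometric variable 2
cv = [linear_cv((a, b), (0.7, 0.3)) for a, b in zip(d01, d23)]  # assumed weights

centers, free = free_energy_profile(cv)
```

In practice the weights would come from a dimensionality-reduction or machine learning method rather than being set by hand, and the histogram would be built from properly sampled or reweighted simulation data.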

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4fd/9437778/f45ba13ba64d/d2ra03660f-f1.jpg
