Discovering Collective Variables of Molecular Transitions via Genetic Algorithms and Neural Networks.

Affiliation

Van 't Hoff Institute for Molecular Sciences, AI4Science Laboratory, and Amsterdam Center for Multiscale Modeling, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands.

Publication information

J Chem Theory Comput. 2021 Apr 13;17(4):2294-2306. doi: 10.1021/acs.jctc.0c00981. Epub 2021 Mar 4.

Abstract

With the continual improvement of computing hardware and algorithms, simulations have become a powerful tool for understanding all sorts of (bio)molecular processes. To handle the large simulation data sets and to accelerate slow, activated transitions, a condensed set of descriptors, or collective variables (CVs), is needed to discern the relevant dynamics that describe the molecular process of interest. However, proposing an adequate set of CVs that can capture the intrinsic reaction coordinate of the molecular transition is often extremely difficult. Here, we present a framework to find an optimal set of CVs from a pool of candidates using a combination of artificial neural networks and genetic algorithms. The approach effectively replaces the encoder of an autoencoder network with genes to represent the latent space, i.e., the CVs. Given a selection of CVs as input, the network is trained to recover the atom coordinates underlying the CV values at points along the transition. The network performance is used as an estimator of the fitness of the input CVs. Two genetic algorithms optimize the CV selection and the neural network architecture. The successful retrieval of optimal CVs by this framework is illustrated with two case studies: the well-known conformational change in the alanine dipeptide molecule and the more intricate transition of a base pair in B-DNA from the classic Watson-Crick pairing to the alternative Hoogsteen pairing. Key advantages of our framework include optimal, interpretable CVs; no need for costly calculation of committor or time-correlation functions; and automatic hyperparameter optimization. In addition, we show that applying a time delay between the network input and output allows for enhanced selection of slow variables. Moreover, the network can also be used to generate molecular configurations of unexplored microstates, for example, for augmentation of the simulation data.
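
The selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a linear least-squares decoder stands in for the trained neural network, the data are synthetic, and all names (`decoder_fitness`, `select_cvs`, `pool`, `coords`, `lag`) as well as the population size, mutation rate, and number of generations are hypothetical choices made for illustration. Only the CV-selection genetic algorithm is sketched; the second genetic algorithm, which optimizes the network architecture, is omitted.

```python
import numpy as np

def decoder_fitness(cvs, coords, lag=0):
    """Score a CV subset by how well a decoder recovers the coordinates.

    Assumption: a linear least-squares decoder stands in for the neural
    network. `lag` pairs CVs at frame t with coordinates at frame t + lag.
    """
    x = cvs[: len(cvs) - lag]
    y = coords[lag:]
    x1 = np.column_stack([x, np.ones(len(x))])  # add a bias column
    weights, *_ = np.linalg.lstsq(x1, y, rcond=None)
    return -float(np.mean((x1 @ weights - y) ** 2))  # higher is fitter

def select_cvs(pool, coords, n_select=2, pop_size=30, n_gen=30, lag=0, seed=0):
    """Toy genetic algorithm: each gene is a tuple of candidate-CV column indices."""
    rng = np.random.default_rng(seed)
    n_pool = pool.shape[1]

    def fitness(gene):
        return decoder_fitness(pool[:, list(gene)], coords, lag)

    def random_gene():
        picks = rng.choice(n_pool, size=n_select, replace=False)
        return tuple(sorted(int(i) for i in picks))

    population = [random_gene() for _ in range(pop_size)]
    for _ in range(n_gen):
        # Elitism: the fittest unique genes survive and act as parents.
        ranked = sorted(set(population), key=fitness, reverse=True)
        parents = ranked[: max(2, pop_size // 4)]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), size=2, replace=False)
            # Crossover: draw the child's CVs from the union of both parents.
            merged = sorted(set(parents[a]) | set(parents[b]))
            child = [int(i) for i in rng.choice(merged, size=n_select, replace=False)]
            if rng.random() < 0.3:  # mutation: swap one CV for a random candidate
                child[int(rng.integers(n_select))] = int(rng.integers(n_pool))
            if len(set(child)) == n_select:  # discard genes with repeated CVs
                children.append(tuple(sorted(child)))
        population = parents + children
    return max(population, key=fitness)

# Synthetic stand-in for trajectory data: columns 0 and 1 of the candidate
# pool determine the "atom coordinates"; the other four columns are noise.
data_rng = np.random.default_rng(1)
n_frames = 500
hidden = data_rng.uniform(-1.0, 1.0, size=(n_frames, 2))
pool = np.column_stack([hidden, data_rng.normal(size=(n_frames, 4))])
mixing = data_rng.normal(size=(2, 9))
coords = hidden @ mixing + 0.01 * data_rng.normal(size=(n_frames, 9))

best = select_cvs(pool, coords)
print("selected CV columns:", best)
```

The selected indices can be compared against the two informative columns of the synthetic pool. Passing `lag > 0` would mimic the time-delayed variant mentioned in the abstract, although that only makes sense for time-ordered frames, which this synthetic data set does not provide.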

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/36ce/8047796/429682a9ff24/ct0c00981_0001.jpg
