Center for Condensed Matter Theory, Department of Physics, Indian Institute of Science, Bangalore 560012, India.
Undergraduate Program, Indian Institute of Science, Bangalore 560012, India.
J Chem Inf Model. 2021 Jan 25;61(1):106-114. doi: 10.1021/acs.jcim.0c01072. Epub 2020 Dec 15.
Double-stranded DNA (dsDNA) has been established as an efficient medium for charge migration, bringing it to the forefront of molecular electronics and biological research. The charge migration rate is controlled by the electronic couplings between two nucleobases of DNA/RNA, and these couplings depend strongly on the intermolecular geometry and orientation. Estimating them for all possible relative geometries of the molecules with computationally demanding first-principles calculations requires substantial time and computational resources. In this article, we present a machine learning (ML)-based model that calculates the electronic coupling between any two bases of dsDNA/dsRNA and bypasses the expensive first-principles calculations. To prepare the input dataset, we use the Coulomb matrix representation, which encodes the atomic identities and coordinates of the DNA base pairs, and we train a feedforward neural network (NN) model on it. The NN model can predict the electronic couplings between dsDNA base pairs in any structural orientation with a mean absolute error (MAE) of less than 0.014 eV. We further use the NN-predicted electronic coupling values to compute the dsDNA/dsRNA conductance.
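As an illustration of the Coulomb matrix representation mentioned in the abstract, the following is a minimal sketch using the commonly cited definition: diagonal entries 0.5 Z_i^2.4 and off-diagonal entries Z_i Z_j / |R_i - R_j|. The function name `coulomb_matrix`, the toy two-atom input, and the choice of distance units are assumptions made for this example. The abstract does not specify the authors' exact preprocessing (atom ordering, padding, or units), so this should not be read as their implementation.

```python
import numpy as np

def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix from nuclear charges and Cartesian coordinates.

    Assumed standard definition (not confirmed by the abstract):
      M[i, i] = 0.5 * Z_i ** 2.4
      M[i, j] = Z_i * Z_j / |R_i - R_j|   for i != j
    """
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)

    # Pairwise interatomic distances via broadcasting: shape (n_atoms, n_atoms).
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)

    # Placeholder on the diagonal to avoid division by zero; overwritten below.
    np.fill_diagonal(dist, 1.0)

    m = np.outer(z, z) / dist
    np.fill_diagonal(m, 0.5 * z ** 2.4)
    return m

# Hypothetical toy input: one carbon and one hydrogen atom 2.0 length units apart.
charges = [6, 1]
coords = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
cm = coulomb_matrix(charges, coords)

# A flattened matrix (or its upper triangle) could then serve as an NN input vector.
features = cm[np.triu_indices(len(charges))]
```

In this sketch the off-diagonal terms carry the geometry dependence, which is what makes such a matrix a plausible input for learning quantities that vary with the relative orientation of two bases.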