Ushenin Konstantin, Khrabrov Kuzma, Tsypin Artem, Ber Anton, Rumiantsev Egor, Kadurin Artur
AIRI, Kutuzovskiy Prospect, Moscow, 121170, Russian Federation.
Ural Federal University, Mira st., Yekaterinburg, 620002, Russian Federation.
J Cheminform. 2025 Apr 29;17(1):65. doi: 10.1186/s13321-025-01010-7.
The electron density is a central quantity in quantum chemistry and underpins many downstream tasks in drug design. Recent deep learning approaches predict the electron density around a molecule from atom types and atom positions. Most of these methods take their ground-truth training data from plane wave (PW) calculations. However, the drug design field mostly relies on the linear combination of atomic orbitals (LCAO) approach to compute quantum properties. In this study, we focus on predicting the electron density of drug-like substances and on training neural networks with LCAO-based datasets. Our experiments show that proper handling of the large amplitudes of core orbitals is crucial for training on LCAO-based data. We propose to store the electron density on standard grids rather than on a uniform grid. This reduces the number of probing points per molecule by a factor of 43 and the storage space requirements by a factor of 8. Finally, we propose a novel architecture based on the DeepDFT model, which we name LAGNet. It is specifically designed and tuned for drug-like substances and DFT datasets.
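The difference between the two storage schemes can be illustrated with a small point-counting sketch. It is not the authors' pipeline: it assumes that a "standard grid" is an atom-centered grid of radial shells times angular directions, and it substitutes a Fibonacci sphere for a proper angular quadrature and omits integration weights. The function names, grid sizes, spacing, padding, and the water geometry (in angstrom) are all hypothetical choices for illustration, so the resulting ratio should not be compared with the factor of 43 reported above.

```python
import numpy as np


def uniform_grid_size(coords, spacing=0.1, padding=2.0):
    """Count points of a uniform box grid enclosing all atoms.

    spacing and padding are illustrative values, not taken from the paper.
    """
    lo = coords.min(axis=0) - padding
    hi = coords.max(axis=0) + padding
    counts = np.floor((hi - lo) / spacing).astype(int) + 1
    return int(np.prod(counts))


def fibonacci_sphere(n):
    """Return n roughly evenly spread unit vectors (stand-in for an angular quadrature)."""
    i = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * i / n)
    azimuth = np.pi * (1.0 + np.sqrt(5.0)) * i
    return np.stack(
        [np.cos(azimuth) * np.sin(polar),
         np.sin(azimuth) * np.sin(polar),
         np.cos(polar)],
        axis=1,
    )


def atom_centered_grid(coords, n_radial=20, n_angular=50, r_max=3.0):
    """Build probing points on shells around every atom.

    Radii grow quadratically, so shells are denser near the nuclei,
    where core orbitals give the density its largest amplitudes.
    """
    radii = r_max * (np.arange(1, n_radial + 1) / n_radial) ** 2
    dirs = fibonacci_sphere(n_angular)
    # Broadcast to shape (atoms, shells, directions, 3), then flatten.
    pts = coords[:, None, None, :] + radii[None, :, None, None] * dirs[None, None, :, :]
    return pts.reshape(-1, 3)


if __name__ == "__main__":
    # Hypothetical water geometry in angstrom, used only as a toy input.
    water = np.array([[0.0, 0.0, 0.0],
                      [0.757, 0.586, 0.0],
                      [-0.757, 0.586, 0.0]])
    n_uniform = uniform_grid_size(water)
    n_atomic = len(atom_centered_grid(water))
    print(f"uniform: {n_uniform} points, atom-centered: {n_atomic} points")
    print(f"ratio: {n_uniform / n_atomic:.1f}")
```

The atom-centered grid scales with the number of atoms rather than with the volume of the bounding box, which is why it tends to need far fewer points for an isolated molecule, while still sampling densely close to the nuclei.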