Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States.
J Chem Inf Model. 2023 May 8;63(9):2656-2666. doi: 10.1021/acs.jcim.2c01526. Epub 2023 Apr 19.
Advances in deep neural networks (DNNs) have made a powerful machine learning method available to researchers across many fields, including the biomedical and cheminformatics communities, where DNNs help to improve tasks such as protein performance, molecular design, and drug discovery. Many of these tasks rely on molecular descriptors to represent molecular characteristics. Despite significant effort and the introduction of numerous methods for deriving molecular descriptors, the quantitative prediction of molecular properties remains challenging. One widely used method of encoding molecular features into bit strings is the molecular fingerprint. In this work, we propose using new Neumann-Cayley Gated Recurrent Units (NC-GRU) inside a neural network autoencoder to create neural molecular fingerprints (NC-GRU fingerprints). The NC-GRU autoencoder introduces orthogonal weights into the widely used GRU architecture, resulting in faster, more stable training and more reliable molecular fingerprints. Combining the novel NC-GRU fingerprints with a multi-task DNN scheme improves performance on various molecular property prediction tasks, such as toxicity, partition coefficient, lipophilicity, and solvation free energy, producing state-of-the-art results on several benchmarks.
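The abstract does not spell out how the orthogonal weights are built. As an illustration only, the sketch below assumes the name refers to the Cayley transform, W = (I + A)^{-1}(I - A), which maps a skew-symmetric matrix A to an orthogonal matrix W, with a truncated Neumann series standing in for the matrix inverse. The function names (`skew`, `neumann_inverse`, `cayley_orthogonal`), the matrix size, and the truncation order are hypothetical choices for this example and are not taken from the paper.

```python
import numpy as np


def skew(M):
    """Return the skew-symmetric part of M, so that A.T == -A."""
    return 0.5 * (M - M.T)


def neumann_inverse(A, order):
    """Approximate (I + A)^{-1} by the truncated Neumann series
    sum_{k=0}^{order} (-A)^k. Converges when the spectral norm of A is < 1."""
    n = A.shape[0]
    term = np.eye(n)
    total = np.eye(n)
    for _ in range(order):
        term = term @ (-A)
        total = total + term
    return total


def cayley_orthogonal(A, order=None):
    """Cayley transform W = (I + A)^{-1} (I - A) of a skew-symmetric A.
    With order=None the inverse is exact; otherwise it is the Neumann
    approximation truncated at the given order."""
    n = A.shape[0]
    I = np.eye(n)
    inv = np.linalg.inv(I + A) if order is None else neumann_inverse(A, order)
    return inv @ (I - A)


# Hypothetical 4x4 example: random skew-symmetric A, rescaled to spectral norm 0.5
rng = np.random.default_rng(0)
A = skew(rng.standard_normal((4, 4)))
A *= 0.5 / np.linalg.norm(A, 2)

W_exact = cayley_orthogonal(A)
W_approx = cayley_orthogonal(A, order=30)

# Deviation of W_exact from orthogonality, and gap between exact and approximate W
orth_error_exact = np.linalg.norm(W_exact.T @ W_exact - np.eye(4))
approx_error = np.linalg.norm(W_exact - W_approx)
```

An orthogonal recurrent weight matrix preserves the norm of the hidden state it multiplies, which is the usual argument for why such constraints can stabilize training of recurrent networks; the sketch only checks orthogonality numerically and does not reproduce the NC-GRU cell itself.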