CNR-Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy.
Chemistry Department, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy.
J Chem Inf Model. 2022 Mar 28;62(6):1411-1424. doi: 10.1021/acs.jcim.2c00205. Epub 2022 Mar 16.
In this paper, we present DeLA-Drug (deep learning algorithm for automated design of druglike analogues), a recurrent neural network (RNN) model composed of two long short-term memory (LSTM) layers and conceived for data-driven generation of similar-to-bioactive compounds. DeLA-Drug captures the syntax of the SMILES strings of more than 1 million compounds from the ChEMBL28 database and, by employing a new strategy called sampling with substitutions (SWS), generates molecules starting from a single user-defined query compound. Remarkably, the algorithm preserves the druglikeness and synthetic accessibility of the known bioactive compounds present in the ChEMBL28 repository. Because it requires no time-demanding fine-tuning procedure, DeLA-Drug can quickly generate focused libraries for further high-throughput screening, which makes it a suitable tool for design even in low-data regimes. To illustrate its applicability, DeLA-Drug was applied to the cannabinoid receptor subtype 2 (CB2R), a known target involved in pathological conditions such as cancer and neurodegeneration. DeLA-Drug, available as a free web platform (http://www.ba.ic.cnr.it/softwareic/deladrugportal/), can help medicinal chemists generate analogues of compounds already available in their laboratories; such analogues are, for this reason, good candidates for easy and low-cost synthesis.
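The abstract names sampling with substitutions (SWS) but does not spell out the procedure, so the following is only a minimal sketch of one plausible reading: walk along the query SMILES and, at a limited number of positions, replace the query token with one sampled from a language model conditioned on what has been emitted so far. Everything here is an assumption made for illustration. A character-level bigram table stands in for the two-layer LSTM, a five-molecule toy corpus stands in for ChEMBL28, and the names `train_bigram`, `sample_next`, `sample_with_substitutions`, `p_sub`, and `max_subs` are hypothetical and not taken from DeLA-Drug.

```python
import random
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical start/end markers for the toy model


def train_bigram(smiles_list):
    """Count character-to-next-character transitions.

    Toy stand-in for the LSTM language model; it treats every character as
    one token, so two-character atoms such as Cl or Br are not handled.
    """
    counts = defaultdict(Counter)
    for smi in smiles_list:
        seq = START + smi + END
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_next(counts, prev, rng):
    """Sample the next character given the previous one (None if unseen)."""
    options = counts.get(prev)
    if not options:
        return None
    chars = list(options)
    weights = [options[c] for c in chars]
    return rng.choices(chars, weights=weights, k=1)[0]


def sample_with_substitutions(counts, query, p_sub=0.2, max_subs=2, seed=None):
    """Copy the query, swapping in model-sampled characters at a few positions.

    Assumed reading of SWS: at each position, with probability p_sub and
    while fewer than max_subs swaps have been made, the query character is
    replaced by a different character sampled from the model.
    Returns (new_string, number_of_substitutions).
    """
    rng = random.Random(seed)
    out, prev, subs = [], START, 0
    for ch in query:
        token = ch
        if subs < max_subs and rng.random() < p_sub:
            proposal = sample_next(counts, prev, rng)
            # Reject the end marker and no-op proposals.
            if proposal is not None and proposal != END and proposal != ch:
                token = proposal
                subs += 1
        out.append(token)
        prev = token  # condition the next sample on what was actually emitted
    return "".join(out), subs


# Toy corpus (single-character atoms only), not the ChEMBL28 training set.
corpus = ["CCO", "CCN", "c1ccccc1O", "CC(=O)O", "CCOC(=O)C"]
model = train_bigram(corpus)
for seed in range(3):
    print(sample_with_substitutions(model, "CC(=O)O", p_sub=0.3, seed=seed))
```

Strings produced this way are not guaranteed to be valid SMILES, so a real pipeline would need a validity filter before any druglikeness or synthetic-accessibility scoring. Capping the number of substitutions is the design choice that keeps outputs close to the query, which matches the abstract's emphasis on generating analogues of a single user-defined compound without fine-tuning.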