Zheng Shuangjia, Yan Xin, Gu Qiong, Yang Yuedong, Du Yunfei, Lu Yutong, Xu Jun
Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China.
National Supercomputer Center in Guangzhou and School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China.
J Cheminform. 2019 Jan 17;11(1):5. doi: 10.1186/s13321-019-0328-9.
Biogenic compounds are important materials for drug discovery and chemical biology. In this work, we report a quasi-biogenic molecule generator (QBMG) that composes virtual quasi-biogenic compound libraries by means of gated recurrent unit (GRU) recurrent neural networks. The generated library retains stereochemical information, which is a crucial feature of natural products. QBMG can reproduce the property distribution of the underlying training set while also generating realistic, novel molecules outside of the training set. Furthermore, these compounds are associated with known bioactivities. By combining this approach with transfer learning, a focused compound library based on a given chemotype/scaffold can also be generated. This approach can be used to generate virtual compound libraries for pharmaceutical lead identification and optimization.
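To make the generation step concrete, the following is a minimal sketch of how a GRU-based generator can sample a molecule one character at a time. It is not the authors' implementation. It assumes molecules are written as character strings (for example SMILES, where "@" marks stereocentres), uses a toy vocabulary, and uses random, untrained weights, so its output is not chemically valid. The names `VOCAB`, `HIDDEN`, `init_params`, `gru_step` and `sample_string` are hypothetical. In a real setting the weights would first be trained on a biogenic compound set and then, for a focused library, trained further on a small scaffold-specific set (transfer learning).

```python
import math
import random

# Toy character vocabulary (an assumption): "^" starts a string, "$" ends it.
VOCAB = ["^", "$", "C", "N", "O", "(", ")", "=", "1", "@", "[", "]", "H"]
HIDDEN = 8  # hidden state size, chosen arbitrarily for the sketch


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(seed=0):
    """Random (untrained) weights; biases are omitted for brevity."""
    rng = random.Random(seed)
    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = rand_matrix(HIDDEN, len(VOCAB), rng)  # input weights
        p["U" + gate] = rand_matrix(HIDDEN, HIDDEN, rng)      # recurrent weights
    p["Wo"] = rand_matrix(len(VOCAB), HIDDEN, rng)            # output projection
    return p


def gru_step(p, x, h):
    """One GRU update (one common convention for the update gate)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]  # reset gate scales the old state
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    # Blend the old state and the candidate state using the update gate.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_string(p, rng, max_len=40):
    """Sample characters autoregressively until "$" or max_len is reached."""
    h = [0.0] * HIDDEN
    token = VOCAB.index("^")
    out = []
    for _ in range(max_len):
        x = [1.0 if i == token else 0.0 for i in range(len(VOCAB))]  # one-hot
        h = gru_step(p, x, h)
        probs = softmax(matvec(p["Wo"], h))
        token = rng.choices(range(len(VOCAB)), weights=probs)[0]
        if VOCAB[token] == "$":
            break
        out.append(VOCAB[token])
    return "".join(out)


if __name__ == "__main__":
    print(sample_string(init_params(seed=0), random.Random(1)))
```

Sampling from the predicted distribution, rather than always taking the most likely character, is what lets such a model produce strings outside its training set. A practical pipeline would also pass each sampled string through a cheminformatics toolkit to discard invalid structures.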