Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland.
Novartis Institutes for Biomedical Research, Basel, Switzerland.
Mol Inform. 2019 Aug;38(8-9):e1900031. doi: 10.1002/minf.201900031. Epub 2019 Jun 6.
The generated database GDB17 enumerates 166.4 billion possible molecules of up to 17 atoms of C, N, O, S and halogens following simple rules of chemical stability and synthetic feasibility; however, medicinal chemistry criteria are not taken into account. Here we applied rules inspired by medicinal chemistry to exclude problematic functional groups and complex molecules from GDB17, and sampled the resulting subset uniformly across molecular size, stereochemistry and polarity to form GDBMedChem, a compact collection of 10 million small molecules. This collection has reduced complexity and better synthetic accessibility than the entire GDB17, but retains a higher sp3-carbon fraction and higher natural product likeness scores than known drugs. GDBMedChem molecules are more diverse than known molecules and differ strongly from them in terms of substructures, and thus represent an unprecedented source of diversity for drug design. GDBMedChem is available for 3D-visualization, similarity searching and download at http://gdb.unibe.ch.
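The uniform sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the bins were defined or how many molecules were drawn per bin, so everything here is an assumption made for illustration: the toy records, the three integer descriptors (heavy atom count for size, stereocenter count for stereochemistry, heteroatom count as a stand-in for polarity), and the helper names `sample_uniform_across_bins` and `bin_key`. It should not be read as the authors' procedure.

```python
import random
from collections import defaultdict


def sample_uniform_across_bins(records, key, per_bin, seed=0):
    """Draw up to `per_bin` records from every bin defined by key(record).

    Taking the same number from each bin flattens the distribution over
    the binned descriptors, whatever the original bin sizes were.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    bins = defaultdict(list)
    for rec in records:
        bins[key(rec)].append(rec)

    sample = []
    for k in sorted(bins):
        members = bins[k]
        # Small bins contribute everything they have.
        sample.extend(rng.sample(members, min(per_bin, len(members))))
    return sample


# Hypothetical toy records, NOT real GDB17 data:
# (identifier, heavy atoms, stereocenters, heteroatoms)
toy = [("mol%d" % i, 10 + i % 3, i % 2, i % 4) for i in range(60)]


def bin_key(rec):
    """Assumed binning: size, stereochemistry and a crude polarity proxy."""
    _, heavy_atoms, stereocenters, heteroatoms = rec
    return (heavy_atoms, stereocenters, heteroatoms)


subset = sample_uniform_across_bins(toy, bin_key, per_bin=2)
print(len(subset))
```

In practice the descriptors would be computed from structures with a cheminformatics toolkit, and the per-bin quota would be chosen to reach the target collection size.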