University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France.
Institute of Organic Chemistry National Academy of Sciences of Ukraine, Murmanska Street 5, Kyiv 02660, Ukraine.
J Chem Inf Model. 2021 Jan 25;61(1):179-188. doi: 10.1021/acs.jcim.0c00936. Epub 2020 Dec 17.
The days when medicinal chemistry was limited to a few series of compounds of therapeutic interest are long gone. Today, no human can hope to acquire a complete overview of the more than a billion existing or feasible compounds, among which potential "blockbuster drugs" are well hidden and yet only a few mouse clicks away. To reach these "hidden treasures", we adapted the generative topographic mapping method to enable efficient navigation through chemical space, from a global overview down to structural pattern detection. For the first time, this analysis covers the complete ZINC library of purchasable compounds, compared against 1.6 million biologically relevant ChEMBL molecules. About 40 000 hierarchical maps of the chemical space were constructed, and structural motifs found in only one of the two libraries were identified. Roughly 20 000 off-market ChEMBL compound families represent incentives to enrich commercial catalogs. Conversely, 125 000 ZINC-specific compound classes, absent from structure-activity databases, are novel paths to explore in medicinal chemistry. The complete list of these chemotypes can be downloaded using the link https://forms.gle/B6bUJj82t9EfmttV6.