The chemical space project.

Author information

Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland.

Publication information

Acc Chem Res. 2015 Mar 17;48(3):722-30. doi: 10.1021/ar500432k. Epub 2015 Feb 17.

Abstract

One of the simplest questions that can be asked about molecular diversity is how many organic molecules are possible in total? To answer this question, my research group has computationally enumerated all possible organic molecules up to a certain size to gain an unbiased insight into the entire chemical space. Our latest database, GDB-17, contains 166.4 billion molecules of up to 17 atoms of C, N, O, S, and halogens, by far the largest small molecule database reported to date. Molecules allowed by valency rules but unstable or nonsynthesizable due to strained topologies or reactive functional groups were not considered, which reduced the enumeration by at least 10 orders of magnitude and was essential to arrive at a manageable database size. Despite these restrictions, GDB-17 is highly relevant with respect to known molecules. Beyond enumeration, understanding and exploiting GDBs (generated databases) led us to develop methods for virtual screening and visualization of very large databases in the form of a "periodic system of molecules" comprising six different fingerprint spaces, with web-browsers for nearest neighbor searches, and the MQN- and SMIfp-Mapplet application for exploring color-coded principal component maps of GDB and other large databases. Proof-of-concept applications of GDB for drug discovery were realized by combining virtual screening with chemical synthesis and activity testing for neurotransmitter receptor and transporter ligands. One surprising lesson from using GDB for drug analog searches is the incredible depth of chemical space, that is, the fact that millions of very close analogs of any molecule can be readily identified by nearest-neighbor searches in the MQN-space of the various GDBs. The chemical space project has opened an unprecedented door on chemical diversity. Ongoing and yet unmet challenges concern enumerating molecules beyond 17 atoms and synthesizing GDB molecules with innovative scaffolds and pharmacophores.
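The nearest-neighbor searches mentioned in the abstract can be illustrated with a minimal sketch. It assumes each molecule is represented by a fixed-length vector of integer counts and that similarity is measured by city-block (Manhattan) distance; both are assumptions made for illustration, not details stated in the abstract. The four-component vectors, the molecule names, and the functions `city_block` and `nearest_neighbors` are hypothetical and are not real MQN values or part of any GDB tool.

```python
from heapq import nsmallest
from typing import Dict, List, Sequence, Tuple


def city_block(a: Sequence[int], b: Sequence[int]) -> int:
    """Return the city-block (Manhattan) distance between two count vectors."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbors(
    query: Sequence[int],
    database: Dict[str, Sequence[int]],
    k: int = 3,
) -> List[Tuple[str, int]]:
    """Return the k database entries closest to the query, nearest first.

    Ties in distance are broken alphabetically by name so the result is
    deterministic.
    """
    scored = ((name, city_block(query, fp)) for name, fp in database.items())
    return nsmallest(k, scored, key=lambda item: (item[1], item[0]))


# Hypothetical toy fingerprints (NOT real MQN values): each tuple stands in
# for a vector of integer descriptor counts.
toy_db = {
    "mol_a": (6, 0, 1, 1),
    "mol_b": (6, 1, 0, 1),
    "mol_c": (10, 2, 2, 3),
    "mol_d": (5, 0, 1, 0),
}

hits = nearest_neighbors((6, 0, 1, 0), toy_db, k=2)
print(hits)
```

A linear scan like this is only practical for small collections; searching a database on the scale of GDB-17 would need an indexed or pre-sorted layout, which this sketch does not attempt.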
