Oprea T I, Gottfries J
EST Lead Informatics and Medicinal Chemistry, AstraZeneca R&D Mölndal, S-43183 Mölndal, Sweden.
J Comb Chem. 2001 Mar-Apr;3(2):157-66. doi: 10.1021/cc0000388.
Combinatorial chemistry needs focused molecular diversity applied to the druglike chemical space (drugspace). A drugspace map can be obtained by systematically applying the same conventions when examining the chemical space, in a manner similar to the Mercator convention in geography: rules are equivalent to dimensions (e.g., longitude and latitude), while structures are equivalent to objects (e.g., cities and countries). Selected rules include size, lipophilicity, polarizability, charge, flexibility, rigidity, and hydrogen bond capacity. For these, extreme values were set, e.g., maximum molecular weight 1500, calculated logarithm of the octanol/water partition coefficient (log P) between -10 and 20, and up to 30 nonterminal rotatable bonds. Only S, N, O, P, and halogens were considered as elements besides C and H. Selected objects include a set of "satellite" structures and a set of representative drugs ("core" structures). Satellites, intentionally placed outside drugspace, have extreme values in one or several of the desired properties, while containing druglike chemical fragments. ChemGPS (chemical global positioning system) is a tool that combines these predefined rules and objects to provide a global drugspace map. The ChemGPS drugspace map coordinates are t-scores extracted via principal component analysis (PCA) from 72 descriptors that evaluate the above-mentioned rules on a total set of 423 satellite and core structures. Global ChemGPS scores describe well the latent structures extracted with PCA for a set of 8599 monocarboxylates, a set of 45 heteroaromatic compounds, and a set of 87 alpha-amino acids. ChemGPS positions novel structures in drugspace via PCA-score prediction, providing a unique mapping device for the druglike chemical space. ChemGPS scores are comparable across a large number of chemicals and do not change as new structures are predicted, making this tool a well-suited reference system for comparing multiple libraries and for keeping track of previously explored regions of the chemical space.
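The extreme values quoted in the abstract can be read as a simple boundary check on each structure. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Descriptors` record and the `within_limits` function are hypothetical names, the descriptor values are assumed to be precomputed elsewhere, and "halogens" is taken to mean F, Cl, Br, and I. Only the limits named in the abstract are checked; the abstract does not list the limits for the remaining rules.

```python
from dataclasses import dataclass

# Elements allowed by the abstract: C and H plus S, N, O, P and halogens.
# Restricting halogens to F, Cl, Br, I is an assumption of this sketch.
ALLOWED_ELEMENTS = frozenset({"C", "H", "S", "N", "O", "P", "F", "Cl", "Br", "I"})


@dataclass(frozen=True)
class Descriptors:
    """Hypothetical container for precomputed per-structure descriptors."""
    mol_weight: float        # molecular weight
    clogp: float             # calculated octanol/water log P
    rotatable_bonds: int     # nonterminal rotatable bonds
    elements: frozenset      # element symbols present in the structure


def within_limits(d: Descriptors) -> bool:
    """Return True if the structure lies inside the limits quoted in the abstract."""
    return (
        d.mol_weight <= 1500
        and -10 <= d.clogp <= 20
        and d.rotatable_bonds <= 30
        and d.elements <= ALLOWED_ELEMENTS   # subset test
    )


# Illustrative inputs only; the numbers are approximate and not from the paper.
aspirin_like = Descriptors(180.2, 1.2, 3, frozenset({"C", "H", "O"}))
boron_compound = Descriptors(250.0, 2.0, 4, frozenset({"C", "H", "B", "O"}))
```

Under these assumptions, `within_limits(aspirin_like)` passes, while `within_limits(boron_compound)` fails because boron is not among the allowed elements.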
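The claim that scores "do not change as new structures are predicted" follows from fitting the PCA model once on the fixed reference set and then only projecting new structures onto it. A minimal NumPy sketch of that idea is given below. It is an illustration, not the ChemGPS implementation: the function names are hypothetical, the descriptor matrices are random stand-ins with the 423 x 72 shape mentioned in the abstract, autoscaling to unit variance is an assumed preprocessing step, and the choice of three components is arbitrary.

```python
import numpy as np


def fit_reference_pca(X_ref, n_components):
    """Fit PCA on the reference set; returns (mean, scale, loadings)."""
    mean = X_ref.mean(axis=0)
    scale = X_ref.std(axis=0, ddof=1)
    scale[scale == 0] = 1.0                  # guard against constant descriptors
    Z = (X_ref - mean) / scale               # autoscaling (an assumption here)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    loadings = vt[:n_components].T           # shape: (n_descriptors, n_components)
    return mean, scale, loadings


def predict_scores(X, model):
    """Project structures onto the fixed reference model (score prediction)."""
    mean, scale, loadings = model
    return ((X - mean) / scale) @ loadings


rng = np.random.default_rng(0)
X_ref = rng.normal(size=(423, 72))           # stand-in for satellites + cores
model = fit_reference_pca(X_ref, n_components=3)

library_a = rng.normal(size=(5, 72))         # stand-in for a new library
scores_a_first = predict_scores(library_a, model)

library_b = rng.normal(size=(8, 72))         # a second library predicted later
scores_b = predict_scores(library_b, model)

# Predicting library_b did not touch the model, so library_a maps to the
# same coordinates as before.
scores_a_again = predict_scores(library_a, model)
```

Because the mean, scaling, and loadings are frozen after fitting, the coordinates of `library_a` are identical before and after `library_b` is predicted. Refitting PCA on each new library would instead give every library its own axes, which is what makes scores from separate local models hard to compare.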