Selecting optimally diverse compounds from structure databases: a validation study of two-dimensional and three-dimensional molecular descriptors.

Author information

Matter H

Affiliation

TRIPOS GmbH, München, Germany.

Publication information

J Med Chem. 1997 Apr 11;40(8):1219-29. doi: 10.1021/jm960352+.

Abstract

The efficiency of the drug discovery process can be significantly improved by using design techniques that maximize the diversity of structure databases or combinatorial libraries. Here, several physicochemical descriptors were investigated as measures of molecular diversity. Based on the 2D or 3D topological similarity of molecules, the relationship between physicochemical metrics and biological activity was studied to identify valid descriptors. Using those descriptors, compounds were selected from a database containing diverse templates and 55 biological classes, and the resulting subsets were evaluated for whether they represent all biological properties and structural variations of the original database. In addition, hierarchical cluster analyses were used to group molecules from the parent database that should have similar biological properties. Using various sets of structurally similar molecules, it was possible to derive quantitative measures of compound similarity in relation to biological properties. A similarity radius was estimated for 2D fingerprints and for molecular steric fields; compounds within this radius of another molecule were shown to have comparable biological properties. This study demonstrates that 2D fingerprints, alone or in combination with other metrics, can serve as the primary descriptor for handling global diversity. In addition, standard atom-pair descriptors or molecular steric fields can be used to correlate structural diversity with biological activity. Hence, the latter two descriptors can be classified as secondary descriptors useful for analog library design, while 2D fingerprints are suited to designing a general library for lead discovery. Based on these findings, an optimally diverse subset containing only 38% of the entire IC93 database was generated using 2D fingerprints. In this subset, no structure has a Tanimoto coefficient above 0.85 to any other structure, yet all biological classes are represented. This reduction of redundancy led to a child database that covers the same physicochemical diversity space and contains the same information as the original database.
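
The 0.85 Tanimoto cutoff described in the abstract can be illustrated with a minimal sketch. The abstract does not state which selection algorithm was used, so the greedy pass below is an assumption chosen for simplicity, not the author's method. The function names (tanimoto, select_diverse), the representation of a 2D fingerprint as a set of "on" bit indices, and the toy fingerprints are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    if not a and not b:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def select_diverse(
    fingerprints: Dict[str, FrozenSet[int]],
    max_similarity: float = 0.85,
) -> List[str]:
    """Greedy pass (an assumption, not the paper's procedure): keep a compound
    only if its similarity to every already-kept compound is <= max_similarity."""
    selected: List[str] = []
    for name, fp in fingerprints.items():
        if all(tanimoto(fp, fingerprints[kept]) <= max_similarity for kept in selected):
            selected.append(name)
    return selected


# Hypothetical toy fingerprints; real 2D fingerprints have hundreds of bits.
toy = {
    "A": frozenset(range(1, 11)),              # bits 1..10
    "B": frozenset(range(1, 12)),              # bits 1..11, near-duplicate of A
    "C": frozenset({1, 2, 3, 20, 21, 22}),     # mostly different from A
    "D": frozenset(set(range(1, 10)) | {12}),  # 9 of 10 bits shared with A
}

print(select_diverse(toy))  # ['A', 'C', 'D']
```

Compound B is dropped because its similarity to A is 10/11 (about 0.91), while D is kept at 9/11 (about 0.82). The result of a greedy pass depends on input order, and the sketch does not check coverage of biological classes, which the study verified separately.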
