Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, FL 34987, USA.
Chem Biol Drug Des. 2012 Nov;80(5):717-24. doi: 10.1111/cbdd.12011. Epub 2012 Aug 31.
Natural products are an important source of bioactive compounds in drug discovery. In this work, we compiled five publicly available natural product databases and performed a comprehensive chemoinformatic analysis of their scaffold content and diversity, together with an overview of their diversity based on molecular fingerprints. The natural product databases were compared with each other, with a set of molecules from in-house combinatorial libraries, and with a general commercial screening library. The publicly available natural product databases were found to differ in scaffold diversity. Contrary to the common assumption that larger libraries have greater scaffold diversity, the largest natural product collection analyzed in this work was not the most diverse. Overall, the general screening library showed the highest scaffold diversity. However, when only the most frequent scaffolds were considered, the general reference library was the least diverse. In general, natural product databases in the public domain showed low molecule overlap. In addition to benzene and acyclic compounds, flavones, coumarins, and flavanones were identified as the most frequent molecular scaffolds across the different natural product collections. These results have direct implications for the computational and experimental screening of natural product databases for drug discovery.
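The kind of comparison summarized above can be illustrated with a minimal Python sketch. It is not the authors' protocol: it assumes each molecule has already been reduced to a scaffold identifier (for example by a cheminformatics toolkit), and the function names `scaffold_stats` and `jaccard_overlap`, the chosen metrics, and the toy data are all hypothetical.

```python
from collections import Counter


def scaffold_stats(scaffolds):
    """Summarize scaffold diversity for one library.

    `scaffolds` holds one scaffold identifier per molecule (assumed
    to be precomputed; this sketch does not derive scaffolds itself).
    """
    counts = Counter(scaffolds)
    n_mol = len(scaffolds)
    n_scaf = len(counts)
    # Scaffolds that occur in exactly one molecule of the library.
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "molecules": n_mol,
        "scaffolds": n_scaf,
        # Higher ratio = more distinct scaffolds per molecule.
        "scaffolds_per_molecule": n_scaf / n_mol if n_mol else 0.0,
        "singleton_fraction": singletons / n_scaf if n_scaf else 0.0,
        "most_common": counts.most_common(3),
    }


def jaccard_overlap(items_a, items_b):
    """Jaccard overlap between two collections, treated as sets."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy, made-up libraries (not data from the paper).
db_a = ["benzene", "benzene", "flavone", "coumarin"]
db_b = ["benzene", "flavone", "flavanone", "acyclic", "indole"]

print(scaffold_stats(db_a))
print(scaffold_stats(db_b))
print(jaccard_overlap(db_a, db_b))
```

In this toy case the larger library also has the higher scaffolds-per-molecule ratio, but the ratio depends on how often scaffolds repeat, not on library size alone, which is consistent with the observation that the largest collection need not be the most diverse.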