Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City, 04510, Mexico.
Mol Inform. 2020 Nov;39(11):e2000050. doi: 10.1002/minf.202000050. Epub 2020 Apr 29.
We report a comprehensive library of 205,903 fragments derived from the recently published Collection of Open Natural Products (COCONUT) data set, which contains more than 400,000 non-redundant natural products. This natural product-based fragment library was compared with two other fragment libraries generated herein from ChEMBL (biologically relevant compounds) and Enamine-REAL (a large on-demand collection of synthetic compounds), both used as reference data sets relevant to drug discovery. Natural products yielded a large diversity of unique fragments, and both the whole structures and the fragments derived from natural products were more diverse and structurally complex than those of the two reference collections. In the course of this work we introduced a novel visual representation of chemical space based on the recently published concept of the statistical-based database fingerprint. The natural product compound and fragment libraries generated and analyzed in this work are freely available.
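To make the notion of "unique fragments" concrete, the following minimal sketch counts the fragments that occur in one library but in none of the others. It is an illustration under stated assumptions, not the procedure used in the work: the SMILES strings are made-up toy data rather than entries from COCONUT, ChEMBL, or Enamine-REAL, the function name `unique_fragments` is hypothetical, and fragmentation and canonicalization are assumed to have already been carried out so that identical fragments share an identical string.

```python
from typing import Dict, Set


def unique_fragments(libraries: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each library, return the fragments absent from every other library.

    Assumes each fragment is already a canonical string, so plain string
    equality is enough to decide whether two fragments are the same.
    """
    result: Dict[str, Set[str]] = {}
    for name, frags in libraries.items():
        # Pool the fragments of all the other libraries into one set.
        others: Set[str] = set().union(
            *(f for n, f in libraries.items() if n != name)
        )
        result[name] = frags - others
    return result


# Toy data (hypothetical, for illustration only).
libraries = {
    "natural_products": {"c1ccccc1", "C1CCOC1", "O=C1CCCO1"},
    "chembl": {"c1ccccc1", "c1ccncc1"},
    "enamine_real": {"c1ccccc1", "C1CCOC1", "C1CCNCC1"},
}

unique = unique_fragments(libraries)
for name, frags in unique.items():
    print(f"{name}: {len(frags)} unique fragment(s): {sorted(frags)}")
```

Set difference keeps the comparison linear in the total number of fragments, which matters for libraries of the size reported here.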