Natural Products Chemistry Laboratory, Centro de Pesquisa René Rachou-Fiocruz, Belo Horizonte, 30190-002, MG, Brazil.
Griffith School of Environment, Griffith University, Gold Coast Campus, Southport, QLD 4222, Australia.
J Nat Prod. 2017 Jun 23;80(6):1758-1766. doi: 10.1021/acs.jnatprod.6b01093. Epub 2017 Jun 15.
The discovery of new bioactive natural products from biota is often confounded by the reisolation of known natural products. Dereplication strategies that analyze NMR and MS data to infer the structural features present in a purified natural product, and then search databases for those substructures, provide an efficient way to rapidly identify known natural products. Unfortunately, this strategy has been hampered by the lack of publicly available, comprehensive natural product databases and open source cheminformatics tools. A new platform, DEREP-NP, has been developed to help solve this problem. DEREP-NP uses the open source cheminformatics program DataWarrior to generate a database containing counts of 65 structural fragments present in 229 358 natural product structures derived from plants, animals, and microorganisms, published before 2013 and freely available in the nonproprietary Universal Natural Products Database (UNPD). By counting the number of times one or more of these structural features occurs in an unknown compound, as deduced from the analysis of its NMR (¹H, HSQC, and/or HMBC) and/or MS data, matching structures carrying the same numeric combination of searched structural features can be retrieved from the database. Whether a matching structure is the same compound can then be confirmed by comparing its spectroscopic data with the literature. This methodology can be applied both to purified natural products and to fractions containing a small number of individual compounds, which are often generated as screening libraries. The utility of DEREP-NP has been verified through the analysis of spectra derived from compounds (and fractions containing two or three compounds) isolated from plant, marine invertebrate, and fungal sources. DEREP-NP is freely available at https://github.com/clzani/DEREP-NP and will help to streamline the natural product discovery process.
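The fragment-count matching described above can be illustrated with a minimal Python sketch. This is not DEREP-NP's implementation (the platform runs inside DataWarrior); the function name `find_matches`, the fragment names, and the three toy compounds are all hypothetical and chosen only to show the idea: a compound is retrieved when its stored count equals the queried count for every fragment that was searched, while fragments left out of the query are unconstrained.

```python
from typing import Dict, List

# Hypothetical toy database: compound name -> counts of structural fragments.
# Fragment names and counts are illustrative only, not taken from DEREP-NP or UNPD.
TOY_DATABASE: Dict[str, Dict[str, int]] = {
    "compound_A": {"methoxy": 2, "aromatic_CH": 3, "carbonyl": 1},
    "compound_B": {"methoxy": 2, "aromatic_CH": 5, "carbonyl": 1},
    "compound_C": {"methoxy": 0, "aromatic_CH": 3, "carbonyl": 2},
}


def find_matches(database: Dict[str, Dict[str, int]],
                 query: Dict[str, int]) -> List[str]:
    """Return (sorted) names of compounds whose fragment counts equal the
    query count for every fragment named in the query.

    Fragments absent from the query are not constrained; a fragment missing
    from a database entry is treated as a count of zero.
    """
    return sorted(
        name
        for name, counts in database.items()
        if all(counts.get(fragment, 0) == n for fragment, n in query.items())
    )


# Counts as they might be deduced from 1H/HSQC/HMBC or MS data (assumed values).
coarse_query = {"methoxy": 2, "carbonyl": 1}
print(find_matches(TOY_DATABASE, coarse_query))

# Adding one more observed feature narrows the candidate list.
refined_query = {"methoxy": 2, "carbonyl": 1, "aromatic_CH": 3}
print(find_matches(TOY_DATABASE, refined_query))
```

In this toy data the coarse query returns two candidates and the refined query returns one, mirroring how each additional counted feature narrows the search; any remaining candidate would still need to be confirmed against literature spectroscopic data.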