Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads 221, DK-2800 Kongens Lyngby, Denmark.
Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
Nat Prod Rep. 2021 Nov 17;38(11):2066-2082. doi: 10.1039/d1np00040c.
Covering: 2016 up to 2021
Mass spectrometry (MS) is an essential technology in natural products research, with MS fragmentation (MS/MS) approaches becoming a key tool. Recent advancements in MS yield dense metabolomics datasets that have conventionally been used by individual labs for individual projects; however, a shift is brewing. A movement towards open MS data (and other structural characterization data) and accessible data mining tools is emerging in natural products research. Over the past 5 years, this movement has rapidly expanded and evolved with no slowdown in sight; the capabilities of today vastly exceed those of 5 years ago. Herein, we address the analysis of individual datasets, a situation we are calling the '2021 status quo', and the emergent framework to systematically capture sample information (metadata) and perform repository-scale analyses. We evaluate public data deposition, discuss the challenges of working at the repository scale, highlight the challenges of metadata capture, and provide illustrative examples of the power of utilizing repository data and the tools that enable it. We conclude that the advancements in MS data collection must be met with advancements in how we utilize data; therefore, we argue that open data and data mining are the next evolution in obtaining the maximum potential in natural products research.