Szwarc Sarah, Rutz Adriano, Lee Kyungha, Mejri Yassine, Bonnet Olivier, Hazni Hazrina, Jagora Adrien, Mbeng Obame Rany B, Noh Jin Kyoung, Otogo N'Nang Elvis, Alaribe Stephenie C, Awang Khalijah, Bernadat Guillaume, Choi Young Hae, Courdavault Vincent, Frederich Michel, Gaslonde Thomas, Huber Florian, Kam Toh-Seok, Low Yun Yee, Poupon Erwan, van der Hooft Justin J J, Kang Kyo Bin, Le Pogam Pierre, Beniddir Mehdi A
Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France.
Institute of Molecular Systems Biology, ETH Zürich, 8093, Zurich, Switzerland.
J Cheminform. 2025 Apr 28;17(1):62. doi: 10.1186/s13321-025-01009-0.
With over 3000 representatives, the monoterpene indole alkaloid (MIA) class is among the most diverse families of plant natural products. Exploring the MS/MS spectral space of these complex compounds with chemoinformatic and computational mass spectrometry tools offers a valuable opportunity to extract and share chemical insights from this emblematic family of natural products (NPs). In this work, we first present a substantially updated version of the MIADB, a database that now contains 422 MS/MS spectra of MIAs, up from 172 initial entries, and that has been uploaded to the GNPS library. We then introduce an innovative workflow that leverages hundreds of fragmentation spectra to support the FAIRification, extraction and dissemination of chemical knowledge. The workflow extracts spectral patterns that match finely defined MIA skeletons. These extracted signatures can then be queried against complex biological extract datasets using MassQL. Applied to an LC-MS/MS dataset of 75 plant extracts, this strategy proved efficient at identifying the diversity of MIA skeletons present in the analyzed samples. Additionally, our work enabled the digitization of structural data for diverse MIA skeletons by converting them into machine-readable formats, thereby enhancing their dissemination to the scientific community.
Scientific contribution: A comprehensive investigation of the monoterpene indole alkaloid chemical space, aiming to highlight skeleton-dependent fragmentation similarity trends and to generate valuable spectrometric signatures that could be used as queries.
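The querying step described in the abstract can be illustrated with a minimal sketch, assuming a skeleton signature is expressed as a list of required MS2 fragment ions. The function names, the default tolerances, and the example m/z values below are hypothetical placeholders, not signatures reported by the authors. The query string uses the MassQL `MS2PROD` condition with the `TOLERANCEMZ` and `INTENSITYPERCENT` qualifiers; the second function mimics the same test locally on a single peak list, without calling the MassQL engine.

```python
from typing import Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def build_massql_query(fragments: Sequence[float],
                       tol_mz: float = 0.01,
                       min_intensity_percent: float = 5.0) -> str:
    """Build a MassQL query that requires every listed MS2 product ion."""
    conditions = [
        f"MS2PROD={mz:.4f}:TOLERANCEMZ={tol_mz:g}"
        f":INTENSITYPERCENT={min_intensity_percent:g}"
        for mz in fragments
    ]
    return "QUERY scaninfo(MS2DATA) WHERE " + " AND ".join(conditions)


def matches_signature(peaks: Sequence[Peak],
                      fragments: Sequence[float],
                      tol_mz: float = 0.01,
                      min_intensity_percent: float = 5.0) -> bool:
    """Local approximation of the query: every required fragment must appear
    within tol_mz among peaks at or above the given percentage of the base peak."""
    if not peaks:
        return False
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return False
    threshold = base * min_intensity_percent / 100.0
    kept = [mz for mz, intensity in peaks if intensity >= threshold]
    return all(any(abs(mz - target) <= tol_mz for mz in kept)
               for target in fragments)


# Hypothetical signature (placeholder m/z values, for illustration only).
signature = [144.0808, 130.0651]
query = build_massql_query(signature)

# Hypothetical MS/MS peak list.
spectrum = [(130.0655, 400.0), (144.0801, 1000.0), (200.1000, 50.0)]
found = matches_signature(spectrum, signature)
```

In practice the generated query string would be run against the LC-MS/MS files with the MassQL tooling itself; the local matcher is only meant to make the logic of a fragment-based signature explicit.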