Bioinformatics Group, Department of Plant Sciences, Wageningen University & Research, Radix Building, Droevendaalsesteeg 1, Wageningen, 6708PB, the Netherlands.
Department of Analytical Chemistry, University of Vienna, Vienna 1090, Austria.
Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae584.
Computational metabolomics workflows have revolutionized the untargeted metabolomics field. However, the organization and prioritization of metabolite features remain laborious. Metabolomics data are often organized through mass fragmentation-based spectral similarity grouping, which yields feature sets that also represent an intuitive and scientifically meaningful first stage of analysis in untargeted metabolomics. Feature-set testing, which exploits such feature sets, has emerged as an approach that is widely used in genomics and in targeted metabolomics pathway enrichment analyses. It formally combines groupings with statistical testing to reach more meaningful pathway enrichment conclusions. Here, we present msFeaST (mass spectral Feature Set Testing), a feature-set testing and visualization workflow for LC-MS/MS untargeted metabolomics data. Feature-set testing statistically assesses differential abundance patterns for groups of features across experimental conditions. We developed msFeaST to use spectral similarity-based feature groupings generated by k-medoids clustering, where the resulting clusters serve as a proxy for groups of structurally similar features with potential biosynthesis pathway relationships. Clustering spectra in this way allows for feature group-wise statistical testing using the globaltest package, which provides high power to detect small concordant effects through joint modeling and reduces multiplicity adjustment penalties. Hence, msFeaST provides interactive integration of semi-quantitative experimental information with mass-spectral structural similarity information, enhancing the prioritization of features and feature sets during exploratory data analysis.
The msFeaST workflow is freely available at https://github.com/kevinmildau/msFeaST and is built to run on macOS and Linux systems.
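The two ideas in the abstract, grouping features by spectral similarity with k-medoids and then testing each group jointly, can be illustrated with a small sketch. This is not the msFeaST implementation and does not call the globaltest R package. It assumes toy binned spectra, cosine distance as a stand-in similarity measure, a farthest-first medoid initialization, and a permutation version of a global-test-style score statistic. All function names and data below are hypothetical.

```python
import numpy as np

def cosine_distance_matrix(vectors):
    """Pairwise cosine distance (1 - cosine similarity) between row vectors."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return 1.0 - np.clip(unit @ unit.T, -1.0, 1.0)

def k_medoids(dist, k, n_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Initialization is deterministic (an assumption of this sketch): the most
    central point first, then repeatedly the point farthest from all medoids.
    """
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < k:
        medoids.append(int(np.argmax(dist[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the previous medoid for an empty cluster
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return np.argmin(dist[:, medoids], axis=1), medoids

def permutation_set_test(X, y, n_perm=999, seed=0):
    """Permutation p-value for the joint association of a feature set with y.

    The statistic is the sum of squared covariances between the centered
    response and each centered feature, so small concordant shifts add up.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    r = y - y.mean()
    def stat(res):
        return float(np.sum((Xc.T @ res) ** 2))
    observed = stat(r)
    exceed = sum(stat(rng.permutation(r)) >= observed for _ in range(n_perm))
    return (1 + exceed) / (n_perm + 1)

# Toy binned spectra for six features: features 0-2 and 3-5 fragment alike.
spectra = np.array([
    [1.0, 0.9, 0.0, 0.0], [0.9, 1.0, 0.1, 0.0], [1.0, 0.8, 0.0, 0.1],
    [0.0, 0.1, 1.0, 0.9], [0.1, 0.0, 0.9, 1.0], [0.0, 0.0, 1.0, 0.8],
])
labels, medoids = k_medoids(cosine_distance_matrix(spectra), k=2)

# Toy intensities (8 samples x 6 features); samples 4-7 are the treated group.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
intensities = np.array([
    [0.1, -0.2, 0.0, 1.0, 2.0, 1.0], [-0.1, 0.1, 0.2, 2.0, 1.0, 1.0],
    [0.2, 0.0, -0.1, 1.0, 2.0, 2.0], [-0.2, 0.1, -0.1, 2.0, 1.0, 2.0],
    [2.1, 1.8, 2.0, 1.0, 2.0, 1.0], [1.9, 2.1, 2.2, 2.0, 1.0, 1.0],
    [2.2, 2.0, 1.9, 1.0, 2.0, 2.0], [1.8, 2.1, 1.9, 2.0, 1.0, 2.0],
])

# One test per spectral cluster instead of one test per feature.
p_values = {
    int(c): permutation_set_test(intensities[:, labels == c], y)
    for c in np.unique(labels)
}
print(labels, p_values)
```

In this toy setup the first three features shift together between conditions and the last three do not, so only the first cluster should show a small p-value. Testing one set per cluster rather than each feature separately is what keeps the number of hypotheses, and hence the multiplicity adjustment, small.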