School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA.
Baylor College of Medicine, Houston, TX, USA.
Sci Rep. 2018 Jun 18;8(1):9291. doi: 10.1038/s41598-018-27031-x.
Mass spectrometry-based metabolomics has progressed significantly in the past decade, and a variety of software packages have been developed for data analysis. However, systematic comparisons of different metabolomics software tools have rarely been conducted. In this study, several representative software packages were comparatively evaluated across the entire metabolomics data analysis pipeline, including data processing, statistical analysis, feature selection, metabolite identification, pathway analysis, and classification model construction. LC-MS-based metabolomics was applied to preclinical Alzheimer's disease (AD) using a small cohort of human cerebrospinal fluid (CSF) samples (N = 30). All three software packages, XCMS Online, SIEVE, and Compound Discoverer, provided consistent and reproducible data processing results. A hybrid method combining statistical testing and support vector machine feature selection was employed to screen key metabolites, yielding complementary sets of candidate biomarkers from the three software packages. Machine learning classification using the candidate biomarkers generated highly accurate and predictive models that classify patients as preclinical AD or control. Overall, our study systematically evaluated different MS-based metabolomics software packages across the entire data analysis pipeline and applied them to candidate biomarker discovery in preclinical AD.
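To make the hybrid feature selection idea concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state which statistical test, SVM variant, or thresholds were used, so everything here is an assumption: a Welch t statistic as the filter, a linear SVM trained by Pegasos-style subgradient descent, ranking by absolute weight, and the hypothetical names `welch_t`, `train_linear_svm`, `hybrid_select`, `t_cutoff`, and `top_k`. The toy intensity matrix is synthetic and unrelated to the study data.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)


def standardize(column):
    """Scale one feature column to zero mean and unit standard deviation."""
    mu, sd = statistics.mean(column), statistics.stdev(column)
    return [(v - mu) / sd for v in column]


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent for a linear SVM without a bias term."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    step = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrinkage
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def hybrid_select(X, y, t_cutoff=2.0, top_k=2):
    """Filter features by |t|, then rank the survivors by |SVM weight|."""
    cols = [[row[j] for row in X] for j in range(len(X[0]))]
    kept = []
    for j, col in enumerate(cols):
        cases = [v for v, label in zip(col, y) if label == 1]
        controls = [v for v, label in zip(col, y) if label == -1]
        if abs(welch_t(cases, controls)) >= t_cutoff:
            kept.append(j)
    if not kept:
        return []
    z = [standardize(cols[j]) for j in kept]
    Xk = [[z[c][i] for c in range(len(kept))] for i in range(len(X))]
    w = train_linear_svm(Xk, y)
    ranked = sorted(zip(kept, w), key=lambda pair: abs(pair[1]), reverse=True)
    return [j for j, _ in ranked[:top_k]]


# Synthetic toy data (assumption): 6 cases, 6 controls, 6 features.
# Features 0 and 3 are strongly shifted in cases, feature 4 moderately, the rest not at all.
base = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
shifts = [10.0, 0.0, 0.0, 10.0, 2.0, 0.0]
X, y = [], []
for i in range(6):
    X.append([base[(i + j) % 6] + shifts[j] for j in range(6)])  # case sample
    y.append(1)
for i in range(6):
    X.append([base[(i + j) % 6] for j in range(6)])  # control sample
    y.append(-1)

selected = hybrid_select(X, y)
print(selected)
```

In this sketch the statistical filter removes features with no group difference before the SVM sees them, and the SVM weights then rank the remaining candidates jointly rather than one at a time. A real analysis would normally use an established library, corrected p-values, and cross-validation, none of which are shown here.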