Scheltema RA, Decuypere S, Dujardin JC, Watson DG, Jansen RC, Breitling R
Groningen Bioinformatics Centre, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Kerklaan 30, 9751 NN Haren, The Netherlands.
Bioanalysis. 2009 Dec;1(9):1551-7. doi: 10.4155/bio.09.146.
Metabolomics LC-MS experiments yield large numbers of peaks, few of which can be identified by database matching. Many of the remaining peaks correspond to derivatives of identified peaks (e.g., isotope peaks, adducts, fragments and multiply charged molecules). In this article, we present a data-reduction approach that automatically identifies these derivative peaks.
Data-driven clustering based on chromatographic peak-shape correlation and intensity patterns across biological replicates reliably identifies derivative peaks. On a test data set obtained from Leishmania donovani extracts, we achieved a 60% reduction in the number of peaks. After quality-control filtering, almost 80% of the peaks could be putatively identified by database matching.
Automated peak filtering substantially speeds up the data-interpretation process.
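The clustering idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Pearson correlation, the 0.9 thresholds, the single-linkage grouping, and the rule of keeping the most intense peak per cluster are all assumptions made for illustration. Peaks are linked only when both their chromatographic profiles and their intensity patterns across replicates correlate strongly.

```python
import numpy as np


def cluster_derivative_peaks(profiles, replicate_intensities,
                             shape_threshold=0.9, pattern_threshold=0.9):
    """Group peaks whose shapes AND replicate patterns both correlate.

    profiles: (n_peaks, n_scans) intensities on a shared retention-time grid.
    replicate_intensities: (n_peaks, n_replicates) peak intensities.
    Thresholds are illustrative assumptions, not values from the article.
    Returns one cluster label per peak.
    """
    profiles = np.asarray(profiles, dtype=float)
    replicate_intensities = np.asarray(replicate_intensities, dtype=float)

    # Pairwise Pearson correlations between peaks (rows).
    shape_corr = np.corrcoef(profiles)
    pattern_corr = np.corrcoef(replicate_intensities)
    linked = (shape_corr >= shape_threshold) & (pattern_corr >= pattern_threshold)

    # Union-find: peaks connected by any chain of links share a cluster.
    parent = list(range(len(profiles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(profiles)):
        for j in range(i + 1, len(profiles)):
            if linked[i, j]:
                parent[find(j)] = find(i)
    return [find(i) for i in range(len(profiles))]


def representatives(labels, replicate_intensities):
    """Keep the most intense peak per cluster (assumed selection rule);
    the other members are treated as putative derivative peaks."""
    mean_intensity = np.asarray(replicate_intensities, dtype=float).mean(axis=1)
    best = {}
    for idx, label in enumerate(labels):
        if label not in best or mean_intensity[idx] > mean_intensity[best[label]]:
            best[label] = idx
    return sorted(best.values())


# Synthetic example: peak 1 co-elutes with peak 0 (e.g. an isotope peak),
# peak 2 elutes earlier and follows a different replicate pattern.
scans = np.arange(11)
shape = np.exp(-(scans - 5) ** 2 / 4.0)
demo_profiles = [1000 * shape, 110 * shape, 500 * np.exp(-(scans - 2) ** 2 / 4.0)]
demo_replicates = [[1000, 1200, 800, 1100],
                   [110, 131, 89, 120],
                   [500, 400, 650, 450]]
demo_labels = cluster_derivative_peaks(demo_profiles, demo_replicates)
demo_kept = representatives(demo_labels, demo_replicates)
```

In this synthetic example, peaks 0 and 1 fall into one cluster and only peak 0 is kept, while peak 2 remains on its own. Requiring both correlations is a conservative choice here: co-eluting but unrelated metabolites share a peak shape, yet are less likely to share an intensity pattern across biological replicates.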