The Advanced Research Institute of Intelligent Sensing Network, Tongji University, Shanghai 201804, China; The Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804, China; School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China.
Comput Struct Biotechnol J. 2013 Jun 19;7:e201304002. doi: 10.5936/csbj.201304002. eCollection 2013.
Comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC/TOF-MS) has recently been applied to metabolomics analyses. However, retention time shifts in the two chromatographic dimensions make it difficult to compare compound profiles obtained from multiple samples. In this work, a novel two-stage peak alignment algorithm has been developed for the analysis of GC×GC/TOF-MS data. In the first stage, the algorithm detects multiple peak entries of the same metabolite and merges them into one peak entry. After a z-score transformation of the metabolite retention times, landmark peaks are selected from all samples based on both the two-dimensional retention times and the mass spectrum similarity of fragment ions, measured by Pearson's correlation coefficient. In the second stage, the original two-dimensional retention time shifts are corrected using a local linear fitting method. A progressive retention time map searching method is then used to align the peaks in all samples together, based on the parameters optimized in the first stage. The algorithm avoids the need to define thresholds for the retention time window and the spectrum similarity, which are very difficult for users to choose because experimental conditions change between runs, even for repeated experiments. The experimental results show that the algorithm works well for peak alignment in real biological samples, which is very important for further analysis.
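As a rough illustration of the landmark selection step described above, the following Python sketch z-scores the two-dimensional retention times of two samples, computes Pearson's correlation coefficient between fragment-ion spectra, and keeps peak pairs that are each other's best match. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the combined score (spectral correlation minus z-scored retention time distance), the mutual-best-match rule, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def zscore(rt):
    """Standardize each retention time dimension to zero mean, unit variance."""
    rt = np.asarray(rt, dtype=float)
    return (rt - rt.mean(axis=0)) / rt.std(axis=0)


def spectral_similarity(spec_a, spec_b):
    """Pearson correlation between every spectrum in A and every spectrum in B."""
    a = np.asarray(spec_a, dtype=float)
    b = np.asarray(spec_b, dtype=float)
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def landmark_pairs(rt_a, spec_a, rt_b, spec_b):
    """Return (index_in_a, index_in_b) pairs that are mutual best matches.

    Assumed scoring rule (not taken from the paper): spectral correlation
    minus the Euclidean distance between z-scored retention times.
    """
    za, zb = zscore(rt_a), zscore(rt_b)
    dist = np.linalg.norm(za[:, None, :] - zb[None, :, :], axis=2)
    score = spectral_similarity(spec_a, spec_b) - dist
    best_b = score.argmax(axis=1)  # best partner in B for each peak in A
    best_a = score.argmax(axis=0)  # best partner in A for each peak in B
    return [(i, int(j)) for i, j in enumerate(best_b) if best_a[j] == i]


# Toy data: three peaks; sample B is sample A with a constant retention
# time shift and doubled intensities (both hypothetical).
rt_a = np.array([[100.0, 1.0], [200.0, 2.0], [300.0, 3.0]])
rt_b = rt_a + np.array([5.0, 0.1])
spec_a = np.array([[10, 0, 5, 1], [0, 8, 1, 6], [3, 3, 9, 0]])
spec_b = 2 * spec_a

pairs = landmark_pairs(rt_a, spec_a, rt_b, spec_b)
print(pairs)
```

Because the z-score removes a constant offset and scale in each retention time dimension, and Pearson's correlation is insensitive to intensity scaling, the matching in this sketch needs no user-defined retention time window or similarity cutoff, in the spirit of the threshold-free design described above.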