A dynamic programming approach for the alignment of signal peaks in multiple gas chromatography-mass spectrometry experiments.

Author information

Robinson Mark D, De Souza David P, Keen Woon Wai, Saunders Eleanor C, McConville Malcolm J, Speed Terence P, Likić Vladimir A

Affiliation

The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC 3050, Australia.

Publication information

BMC Bioinformatics. 2007 Oct 29;8:419. doi: 10.1186/1471-2105-8-419.

DOI:10.1186/1471-2105-8-419

PMID:17963529

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2194738/

Abstract

BACKGROUND

Gas chromatography-mass spectrometry (GC-MS) is a robust platform for the profiling of certain classes of small molecules in biological samples. When multiple samples are profiled, including replicates of the same sample and/or different sample states, one needs to account for retention time drifts between experiments. This can be achieved either by the alignment of chromatographic profiles prior to peak detection, or by matching signal peaks after they have been extracted from chromatogram data matrices. Automated retention time correction is particularly important in non-targeted profiling studies.

RESULTS

A new approach for matching signal peaks based on dynamic programming is presented. The proposed approach relies on both peak retention times and mass spectra. The alignment of more than two peak lists involves three steps: (1) all possible pairs of peak lists are aligned, and the similarity of each pair of peak lists is estimated; (2) a guide tree is built based on the similarity between the peak lists; (3) peak lists are progressively aligned starting with the two most similar peak lists, following the guide tree until all peak lists are exhausted. When experiments are performed on two or more sample states, each with multiple replicates, peak lists within each set of replicate experiments are aligned first (within-state alignment), and the resulting alignments are then aligned with each other (between-state alignment). When more than two sets of replicate experiments are present, the between-state alignment also employs the guide tree. We demonstrate the usefulness of this approach on GC-MS metabolic profiling experiments acquired on wild-type and mutant Leishmania mexicana parasites.
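The pairwise step described above can be sketched as a global dynamic programming alignment. This is a minimal illustration under stated assumptions, not the authors' implementation: the peak representation, the `similarity` form (spectral cosine scaled by a Gaussian in retention time difference), the cost of `1 - similarity` for a match, and the values of `rt_tol` and `gap` are all hypothetical choices made for the example.

```python
import math
from itertools import combinations

# A peak is (retention_time, spectrum); spectrum is a list of intensities.

def similarity(p, q, rt_tol=2.0):
    """Assumed peak similarity: spectral cosine scaled by a Gaussian in RT difference."""
    (rt_p, spec_p), (rt_q, spec_q) = p, q
    dot = sum(a * b for a, b in zip(spec_p, spec_q))
    norm = math.sqrt(sum(a * a for a in spec_p)) * math.sqrt(sum(b * b for b in spec_q))
    if norm == 0:
        return 0.0
    return (dot / norm) * math.exp(-((rt_p - rt_q) ** 2) / (2 * rt_tol ** 2))

def align_pair(list_a, list_b, gap=0.3):
    """Global DP alignment of two peak lists, minimizing total cost.

    Matching two peaks costs 1 - similarity; leaving a peak unmatched costs `gap`.
    Returns (total_cost, pairs); pairs holds (index_a, index_b), None marks a gap.
    """
    n, m = len(list_a), len(list_b)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], move[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        cost[0][j], move[0][j] = j * gap, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = 1.0 - similarity(list_a[i - 1], list_b[j - 1])
            options = [
                (cost[i - 1][j - 1] + match, "diag"),  # pair peak i with peak j
                (cost[i - 1][j] + gap, "up"),          # peak i has no partner
                (cost[i][j - 1] + gap, "left"),        # peak j has no partner
            ]
            cost[i][j], move[i][j] = min(options, key=lambda o: o[0])
    # Trace back from the bottom-right cell to recover the matched pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if move[i][j] == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i][j] == "up":
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs

def most_similar_pair(peak_lists):
    """Align every pair of peak lists and return the indices of the lowest-cost pair."""
    return min(
        combinations(range(len(peak_lists)), 2),
        key=lambda ij: align_pair(peak_lists[ij[0]], peak_lists[ij[1]])[0],
    )

run_a = [(10.0, [1, 0, 0]), (20.0, [0, 1, 0]), (30.0, [0, 0, 1])]
run_b = [(10.5, [1, 0, 0]), (30.2, [0, 0, 1])]
total_cost, pairs = align_pair(run_a, run_b)
print(pairs)  # [(0, 0), (1, None), (2, 1)]
```

Here `most_similar_pair` only picks the starting point for a progressive alignment. Building the full guide tree and aligning alignments against each other are left out, and the raw cost is not normalized for list length, which a real similarity estimate would likely need.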

CONCLUSION

We propose a progressive method to match signal peaks across multiple GC-MS experiments based on dynamic programming. A sensitive peak similarity function is proposed to balance peak retention time and peak mass spectra similarities. This approach can produce the optimal alignment between an arbitrary number of peak lists, and explicitly models within-state and between-state peak alignment. The accuracy of the proposed method was close to the accuracy of manually curated peak matching, which required tens of man-hours for the analyzed data sets. The proposed approach may offer significant advantages for processing high-throughput metabolomics data, especially when large numbers of experimental replicates and multiple sample states are analyzed.
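One way to read a similarity function that balances retention time and mass spectra is as a product of a spectral term and a retention time term. The form below is an illustrative assumption, since the abstract does not state the formula: $S_{ij}$ is taken to be the cosine similarity between the mass spectra $\mathbf{m}_i$ and $\mathbf{m}_j$ of peaks $i$ and $j$, $t_i$ and $t_j$ are their retention times, and $D$ is a hypothetical tolerance parameter that sets how quickly similarity decays with retention time difference.

```latex
P_{ij} = S_{ij}\,\exp\!\left(-\frac{(t_i - t_j)^2}{2D^2}\right),
\qquad
S_{ij} = \frac{\mathbf{m}_i \cdot \mathbf{m}_j}{\lVert \mathbf{m}_i \rVert\,\lVert \mathbf{m}_j \rVert}
```

Under this form, two peaks with identical spectra ($S_{ij} = 1$) that elute 2 s apart score $\exp(-0.5) \approx 0.61$ when $D = 2$ s. A small $D$ lets retention time dominate the match, while a large $D$ lets the spectra dominate.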

Similar articles

Smith-Waterman peak alignment for comprehensive two-dimensional gas chromatography-mass spectrometry.

BMC Bioinformatics. 2011 Jun 15;12:235. doi: 10.1186/1471-2105-12-235.

TagFinder for the quantitative analysis of gas chromatography-mass spectrometry (GC-MS)-based metabolite profiling experiments.

Bioinformatics. 2008 Mar 1;24(5):732-7. doi: 10.1093/bioinformatics/btn023. Epub 2008 Jan 19.

Progressive peak clustering in GC-MS Metabolomic experiments applied to Leishmania parasites.

Bioinformatics. 2006 Jun 1;22(11):1391-6. doi: 10.1093/bioinformatics/btl085. Epub 2006 Mar 9.

Global peak alignment for comprehensive two-dimensional gas chromatography mass spectrometry using point matching algorithms.

J Bioinform Comput Biol. 2016 Dec;14(6):1650032. doi: 10.1142/S0219720016500323. Epub 2016 Sep 9.

Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets.

BMC Bioinformatics. 2012 Aug 27;13:214. doi: 10.1186/1471-2105-13-214.

An iterative block-shifting approach to retention time alignment that preserves the shape and area of gas chromatography-mass spectrometry peaks.

BMC Bioinformatics. 2008 Aug 12;9 Suppl 9(Suppl 9):S15. doi: 10.1186/1471-2105-9-S9-S15.

[Development of a widely-targeted metabolomics method based on gas chromatography-mass spectrometry].

Se Pu. 2023 Jun 8;41(6):520-526. doi: 10.3724/SP.J.1123.2022.10003.

An optimal peak alignment for comprehensive two-dimensional gas chromatography mass spectrometry using mixture similarity measure.

Bioinformatics. 2011 Jun 15;27(12):1660-6. doi: 10.1093/bioinformatics/btr188. Epub 2011 Apr 14.

Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching.

Bioinformatics. 2006 Sep 1;22(17):2059-65. doi: 10.1093/bioinformatics/btl355. Epub 2006 Jul 4.

Cited by

Continuous Wavelet Transform-Based Method for High-Sensitivity Detection of Image Signals of Fluorescence Lateral Flow Assay.

Sensors (Basel). 2025 Jun 20;25(13):3846. doi: 10.3390/s25133846.

DA_2DCHROM - a data alignment tool for applications on real GC × GC-TOF samples.

Anal Bioanal Chem. 2023 May;415(13):2641-2651. doi: 10.1007/s00216-023-04679-7. Epub 2023 Apr 10.

Metabolic flux analysis: a comprehensive review on sample preparation, analytical techniques, data analysis, computational modelling, and main application areas.

RSC Adv. 2022 Sep 7;12(39):25528-25548. doi: 10.1039/d2ra03326g. eCollection 2022 Sep 5.

Gas Chromatographic Fingerprint Analysis for the Comparison of Seized Cannabis Samples.

Molecules. 2021 Nov 2;26(21):6643. doi: 10.3390/molecules26216643.

Mining plant metabolomes: Methods, applications, and perspectives.

Plant Commun. 2021 Sep 4;2(5):100238. doi: 10.1016/j.xplc.2021.100238. eCollection 2021 Sep 13.

The metaRbolomics Toolbox in Bioconductor and beyond.

Metabolites. 2019 Sep 23;9(10):200. doi: 10.3390/metabo9100200.

DIAlignR Provides Precise Retention Time Alignment Across Distant Runs in DIA and Targeted Proteomics.

Mol Cell Proteomics. 2019 Apr;18(4):806-817. doi: 10.1074/mcp.TIR118.001132. Epub 2019 Jan 31.

Discovery and validation of potential urinary biomarkers for bladder cancer diagnosis using a pseudotargeted GC-MS metabolomics method.

Oncotarget. 2017 Mar 28;8(13):20719-20728. doi: 10.18632/oncotarget.14988.

Metabolic differences of industrial acarbose-producing Actinoplanes sp. A56 under various osmolality levels.

World J Microbiol Biotechnol. 2016 Jan;32(1):3. doi: 10.1007/s11274-015-1976-1. Epub 2015 Dec 28.

Interactive performances of betaine on the metabolic processes of Pseudomonas denitrificans.

J Ind Microbiol Biotechnol. 2015 Feb;42(2):273-8. doi: 10.1007/s10295-014-1562-9. Epub 2014 Dec 14.
