Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver Campus, 2036 Main Mall, Vancouver, V6T 1Z1 British Columbia, Canada.
School of Civil and Environmental Engineering, Nanyang Technological University, Singapore 639798, Singapore.
J Am Soc Mass Spectrom. 2021 Sep 1;32(9):2296-2305. doi: 10.1021/jasms.0c00478. Epub 2021 Mar 19.
Tandem mass spectral (MS/MS) data in liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis are often contaminated because precursor ions are selected with a low-resolution quadrupole mass filter. In this work, we developed a strategy to differentiate contamination fragment ions (CFIs) from true fragment ions (TFIs) in an MS/MS spectrum. The rationale is that TFIs should coelute with their parent ions, but CFIs should not. To assess coelution, we performed a parallel LC-MS/MS analysis in data-independent acquisition (DIA) mode with all-ion fragmentation (AIF). Using the DIA (AIF) data, a peak-peak correlation (PPC) score is calculated between the extracted ion chromatogram (EIC) of the fragment ion, taken from the MS/MS scans, and the EIC of the precursor ion, taken from the MS1 scans. A high PPC score indicates a TFI, and a low PPC score indicates a CFI. In tests on metabolomics data generated by high-resolution QTOF and Orbitrap MS instruments from various vendors in different LC-MS configurations, we found that more than 70% of the fragment ions have PPC scores < 0.8, and we identified three common sources of CFIs: (1) solvent contamination, (2) adjacent chemical contamination, and (3) undetermined signals from artifacts and noise. Combining PPC scores with other precursor and fragment ion information, we further developed a machine learning model that can robustly and conservatively predict CFIs. Incorporating the machine learning model, we created an R program, MS2Purifier, that automatically recognizes CFIs and cleans the MS/MS spectra of metabolic features in LC-MS/MS data with high sensitivity and specificity.
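The coelution test described above can be illustrated with a minimal sketch. It assumes the PPC score is a Pearson correlation between two EICs sampled at the same scans, which the abstract does not state explicitly. The function names `ppc_score` and `label_fragment`, the toy intensity traces, and the use of 0.8 as a hard cutoff are all hypothetical choices made for illustration; MS2Purifier itself is an R program that combines PPC scores with other features in a machine learning model.

```python
import math


def ppc_score(precursor_eic, fragment_eic):
    """Pearson correlation between two EICs sampled at the same scans.

    Assumption: both traces are already aligned on one retention-time grid.
    """
    if len(precursor_eic) != len(fragment_eic):
        raise ValueError("EICs must share the same retention-time grid")
    n = len(precursor_eic)
    if n < 2:
        return 0.0
    mean_p = sum(precursor_eic) / n
    mean_f = sum(fragment_eic) / n
    cov = sum((p - mean_p) * (f - mean_f)
              for p, f in zip(precursor_eic, fragment_eic))
    var_p = sum((p - mean_p) ** 2 for p in precursor_eic)
    var_f = sum((f - mean_f) ** 2 for f in fragment_eic)
    if var_p == 0 or var_f == 0:
        # A perfectly flat trace has no peak shape to correlate.
        return 0.0
    return cov / math.sqrt(var_p * var_f)


def label_fragment(score, threshold=0.8):
    """Hypothetical hard cutoff; the paper uses a trained model instead."""
    return "TFI" if score >= threshold else "CFI"


# Toy intensity traces (arbitrary units), one value per scan.
precursor = [0, 5, 40, 100, 45, 6, 0]         # MS1 EIC of the precursor ion
coeluting = [0, 2, 19, 52, 22, 3, 0]          # fragment that tracks the precursor
background = [30, 31, 29, 30, 32, 30, 29]     # solvent-like constant signal

for name, trace in [("coeluting", coeluting), ("background", background)]:
    score = ppc_score(precursor, trace)
    print(f"{name}: PPC = {score:.3f} -> {label_fragment(score)}")
```

In this toy case the fragment whose trace rises and falls with the precursor scores close to 1, while the near-constant background signal scores far below 0.8, mirroring the solvent-contamination case listed among the CFI sources.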