Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia.
Anal Chem. 2009 Dec 15;81(24):10106-15. doi: 10.1021/ac901476u.
The ultrahigh-resolution Fourier transform ion cyclotron resonance (FTICR) mass spectrum of natural organic matter (NOM) contains several thousand peaks, with dozens of molecules matching the same nominal mass. Such complexity poses a significant challenge for automatic data interpretation, in which the most difficult task is molecular formula assignment, especially for heavy and/or multielement ions. In this study, a new universal algorithm for automatic treatment of FTICR mass spectra of NOM and humic substances, based on total mass difference statistics (TMDS), has been developed and implemented. The algorithm enables a blind search for unknown building blocks (instead of a priori known ones) by revealing repetitive patterns present in spectra. In this respect, it differs from all previously developed approaches. The algorithm was implemented in FIRAN, software for fully automated analysis of mass data with high peak density. The specific feature of FIRAN is its ability to assign formulas to heavy and/or multielement molecules using a "virtual elements" approach. To verify the approach, it was used to process mass spectra of sodium polystyrene sulfonate (PSS, M(w) = 2200 Da) and polymethacrylate (PMA, M(w) = 3290 Da), which produce heavy, multielement, and multiply charged ions. Application of TMDS unambiguously identified the monomers present in the polymers, consistent with their structure: C(8)H(7)SO(3)Na for PSS and C(4)H(6)O(2) for PMA. It also allowed unambiguous formula assignment to all multiply charged peaks, including the heaviest peak in the PMA spectrum at mass 4025.6625 with charge state 6- (mass bias -0.33 ppm). Application of the TMDS algorithm to data on Suwannee River fulvic acid (FA) demonstrated its unique capabilities in the analysis of spectra with high peak density: it identified not only the known small building blocks in the structure of FA, such as CH(2), H(2), C(2)H(2)O, and O, but also a heavier unit at 154.027 amu. The latter was identified for the first time and assigned the formula C(7)H(6)O(4), consistent with the structure of dihydroxybenzoic acids. The presence of these compounds in the structure of FA had so far been suggested numerically but never proven directly. It was concluded that application of the TMDS algorithm opens new horizons in unfolding the molecular complexity of NOM and other natural products.
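The idea behind total mass difference statistics can be illustrated with a minimal sketch. It is not the FIRAN implementation, whose details the abstract does not give. The sketch assumes that a building block shows up as a mass difference recurring between many peak pairs, so it histograms all pairwise differences and ranks the bins by count. The function names (`formula_mass`, `mass_difference_counts`, `top_differences`), the bin width, the cutoff, and the synthetic peak list are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations

# Monoisotopic masses (u) of the most abundant isotopes of C, H, and O.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def formula_mass(counts):
    """Monoisotopic mass of a formula given as {element: atom count}."""
    return sum(MONO[element] * n for element, n in counts.items())


def mass_difference_counts(masses, bin_width=0.001, max_diff=200.0):
    """Count every pairwise mass difference, binned to bin_width.

    bin_width and max_diff are illustrative assumptions, not values
    taken from the paper.
    """
    counts = Counter()
    for low, high in combinations(sorted(masses), 2):
        diff = high - low
        if diff <= max_diff:
            counts[round(diff / bin_width)] += 1
    return counts


def top_differences(masses, n=5, bin_width=0.001, max_diff=200.0):
    """Return the n most frequent mass differences as (mass, count) pairs."""
    counts = mass_difference_counts(masses, bin_width, max_diff)
    return [(round(b * bin_width, 3), c) for b, c in counts.most_common(n)]


# Synthetic peak list (an assumption for this demo, not measured data):
# a hypothetical base mass extended by CH2 and O units.
CH2 = formula_mass({"C": 1, "H": 2})
O = formula_mass({"O": 1})
peaks = [200.0 + i * CH2 + j * O for i in range(5) for j in range(4)]

for diff, count in top_differences(peaks, n=3):
    print(f"{diff:.3f} u occurs {count} times")

# Mass of C7H6O4, for comparison with the 154.027 amu unit in the abstract.
print(f"C7H6O4: {formula_mass({'C': 7, 'H': 6, 'O': 4}):.4f} u")
```

On the synthetic list, the CH(2) and O spacings rank highest because they were built into the peaks. Real spectra would additionally need noise filtering, charge-state handling, and a mass tolerance tied to instrument accuracy, none of which is attempted here.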