†Eawag, Swiss Federal Institute for Aquatic Science and Technology, 8600 Dübendorf, Switzerland.
‡Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092 Zürich, Switzerland.
Anal Chem. 2015 Jun 2;87(11):5738-44. doi: 10.1021/acs.analchem.5b00941. Epub 2015 May 15.
Fast and memory-efficient calculation of theoretical isotope patterns is crucial for the routine interpretation of mass spectrometric data. For high-resolution experiments, calculations must provide the exact masses and probabilities of the relevant isotopologues for a wide range of polyisotopic compounds while pruning low-probability ones. Here, a novel yet simple treelike structure is introduced to swiftly derive sets of relevant subisotopologues for each element in a molecule, which are then combined into the isotopologues of the full molecule. In contrast to existing approaches, subisotopologues are derived from each other in separable tree branches via transitions that each replace a single atom of the element's most abundant isotope. Moreover, the underlying transition trees prevent redundant replacements and permit detection of the most probable isotopologue in a first phase. In a second, parallelized phase, a relative threshold can then be exploited to precisely preprune large fractions of the remaining subisotopologues. The performance gain from such early pruning, and the lower variation in the distortion of simulated data when relative rather than absolute thresholds are used, were validated in a large-scale benchmark simulation comprising several thousand molecular formulas, an unprecedented number. Both the algorithm and a wealth of related features are freely available as the R package enviPat and as a user-friendly Web interface.
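The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the enviPat implementation: the function names (`subisotopologues`, `isotope_pattern`), the small `ISOTOPES` table with rounded masses and abundances, and the dict-based formula input are all assumptions made for illustration. As a simplification, the sketch enumerates each element's transition tree in full before applying the relative threshold, and it runs serially rather than in parallel.

```python
from math import prod

# Assumed, rounded isotope data: (exact mass in u, natural abundance),
# with the most abundant isotope of each element listed first.
ISOTOPES = {
    "C": [(12.0, 0.9893), (13.003355, 0.0107)],
    "H": [(1.007825, 0.999885), (2.014102, 0.000115)],
    "Cl": [(34.968853, 0.7576), (36.965903, 0.2424)],
}


def subisotopologues(isotopes, n):
    """Enumerate all isotope compositions of n atoms of one element.

    The walk starts at the composition made only of the most abundant
    isotope (index 0). Each tree edge replaces one atom of isotope 0 by
    isotope j. A child may only use j >= the index used on the edge into
    its parent, so every composition is reached exactly once.
    Returns a list of (mass, probability) pairs.
    """
    m0, a0 = isotopes[0]
    root = (n,) + (0,) * (len(isotopes) - 1)
    # Stack entries: (counts per isotope, mass, probability, smallest allowed j)
    stack = [(root, n * m0, a0 ** n, 1)]
    out = []
    while stack:
        counts, mass, prob, j_min = stack.pop()
        out.append((mass, prob))
        if counts[0] == 0:
            continue  # no atom of the most abundant isotope left to replace
        for j in range(j_min, len(isotopes)):
            mj, aj = isotopes[j]
            child = list(counts)
            child[0] -= 1
            child[j] += 1
            # Multinomial probability ratio for a single replacement 0 -> j
            ratio = counts[0] / child[j] * aj / a0
            stack.append((tuple(child), mass - m0 + mj, prob * ratio, j))
    return out


def isotope_pattern(formula, rel_threshold=1e-3):
    """Return [(mass, probability)] sorted by mass for a formula such as
    {"C": 2, "H": 6}, keeping only isotopologues whose probability is at
    least rel_threshold times that of the most probable isotopologue."""
    per_element = [subisotopologues(ISOTOPES[el], n) for el, n in formula.items()]

    # Phase 1: the most probable isotopologue of the molecule is the
    # product of the most probable subisotopologue of every element.
    maxima = [max(p for _, p in subs) for subs in per_element]
    cutoff = rel_threshold * prod(maxima)

    # Phase 2a: pre-prune subisotopologues that cannot reach the cutoff
    # even when every other element contributes its maximum.
    pruned = [
        [(m, p) for m, p in subs if p >= rel_threshold * p_best]
        for subs, p_best in zip(per_element, maxima)
    ]

    # Phase 2b: combine elements one by one, dropping partial products
    # that cannot reach the cutoff with the best remaining elements.
    partial = [(0.0, 1.0)]
    for i, subs in enumerate(pruned):
        best_rest = prod(maxima[i + 1:])
        partial = [
            (m1 + m2, p1 * p2)
            for m1, p1 in partial
            for m2, p2 in subs
            if p1 * p2 * best_rest >= cutoff
        ]
    return sorted(partial)


if __name__ == "__main__":
    for mass, prob in isotope_pattern({"C": 1, "H": 3, "Cl": 1}):
        print(f"{mass:.5f}  {prob:.6f}")
```

With the assumed abundances, `isotope_pattern({"Cl": 2}, rel_threshold=0.5)` keeps two of the three Cl2 isotopologues, because only the all-37Cl species falls below half the probability of the most probable one. Since the cutoff is tied to the most probable isotopologue rather than to a fixed absolute probability, the same `rel_threshold` applies the same relative pruning to small and large molecules alike.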