Netherlands eScience Center, Science Park 140, 1098 XG, Amsterdam, The Netherlands.
Rapid Commun Mass Spectrom. 2012 Oct 30;26(20):2461-71. doi: 10.1002/rcm.6364.
High-resolution multistage MS(n) data contain detailed information that can be used for structural elucidation of compounds observed in metabolomics studies. However, full exploitation of these complex data requires substantial analysis effort by human experts. In silico methods currently used to support data annotation by assigning substructures of candidate molecules are limited to a single level of MS fragmentation.
We present an extended substructure-based approach which allows annotation of hierarchical spectral trees obtained from high-resolution multistage MS(n) experiments. The algorithm yields a hierarchical tree of substructures of a candidate molecule to explain the fragment peaks observed at consecutive levels of the multistage MS(n) spectral tree. A matching score is calculated that indicates how well the candidate structure can explain the observed hierarchical fragmentation pattern.
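The abstract does not give the scoring formula, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes a spectral tree of peaks, a candidate whose substructures are given as sets of atom indices with precomputed ion masses, and an intensity-weighted score; the names `Peak`, `explained`, `match_score` and the tolerance `tol` are hypothetical. The hierarchical constraint is modeled by requiring that the substructure assigned to a child peak is contained in the substructure assigned to its parent peak.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Peak:
    """One fragment peak; children are the peaks of its next-level MS(n) spectrum."""
    mz: float
    intensity: float
    children: List["Peak"] = field(default_factory=list)


# Hypothetical candidate representation: substructure (atom indices) -> ion mass.
Fragments = Dict[FrozenSet[int], float]


def total_intensity(peak: Peak) -> float:
    """Sum the intensity of a peak and of all peaks below it."""
    return peak.intensity + sum(total_intensity(c) for c in peak.children)


def explained(peak: Peak, parent_atoms: FrozenSet[int],
              fragments: Fragments, tol: float) -> float:
    """Best explained intensity in the subtree rooted at `peak`.

    A substructure may explain the peak only if its mass matches within
    `tol` and it is contained in the substructure assigned to the parent.
    """
    best = 0.0
    for atoms, mass in fragments.items():
        if abs(mass - peak.mz) <= tol and atoms <= parent_atoms:
            below = sum(explained(c, atoms, fragments, tol) for c in peak.children)
            best = max(best, peak.intensity + below)
    return best


def match_score(root_peaks: List[Peak], all_atoms: FrozenSet[int],
                fragments: Fragments, tol: float = 0.01) -> float:
    """Fraction of total intensity the candidate can explain (assumed score)."""
    total = sum(total_intensity(p) for p in root_peaks)
    if total == 0:
        return 0.0
    hit = sum(explained(p, all_atoms, fragments, tol) for p in root_peaks)
    return hit / total


# Toy candidate with four atoms and four substructures (invented masses).
atoms = frozenset({0, 1, 2, 3})
frags: Fragments = {
    frozenset({0, 1, 2}): 100.0,
    frozenset({0, 1}): 60.0,
    frozenset({2, 3}): 60.0,
    frozenset({3}): 30.0,
}

# Hierarchical tree: 100.0 -> 60.0 -> 30.0
tree = [Peak(100.0, 10.0, [Peak(60.0, 5.0, [Peak(30.0, 5.0)])])]
# The same peaks with the hierarchy discarded.
flat = [Peak(100.0, 10.0), Peak(60.0, 5.0), Peak(30.0, 5.0)]

hier_score = match_score(tree, atoms, frags)  # 0.75
flat_score = match_score(flat, atoms, frags)  # 1.0
```

In this toy case the flattened peak list is fully explained, whereas the tree is not: the 30.0 peak would need atom 3, which is not part of the substructure {0, 1} assigned to its parent peak at 60.0. This is one way the hierarchy can discriminate between candidates beyond simply contributing more fragment ions.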
The method was applied to MS(n) spectral trees of a set of compounds representing important chemical classes in metabolomics. Based on the calculated score, the correct molecules were successfully prioritized among extensive sets of candidate structures retrieved from the PubChem database.
The results indicate that the inclusion of subsequent levels of fragmentation in the automatic annotation of MS(n) data improves the identification of the correct compounds. We show that, especially in the case of lower mass accuracy, this improvement is not only due to the inclusion of additional fragment ions in the analysis, but also to the specific hierarchical information present in the MS(n) spectral trees. This method may significantly reduce the time required by MS experts to analyze complex MS(n) data.