Metabolite signal identification in accurate mass metabolomics data with MZedDB, an interactive m/z annotation tool utilising predicted ionisation behaviour 'rules'.

Authors

Draper John, Enot David P, Parker David, Beckmann Manfred, Snowdon Stuart, Lin Wanchang, Zubair Hassan

Affiliation

Institute of Biological Environmental and Rural Sciences, Aberystwyth University, Aberystwyth SY23 3DA, UK.

Publication

BMC Bioinformatics. 2009 Jul 21;10:227. doi: 10.1186/1471-2105-10-227.

Abstract

BACKGROUND

Metabolomics experiments using mass spectrometry (MS) measure the mass-to-charge ratio (m/z) and intensity of ionised molecules in crude extracts of complex biological samples to generate high-dimensional metabolite 'fingerprint' or metabolite 'profile' data. High-resolution MS instruments routinely achieve a mass accuracy of < 5 ppm (parts per million), potentially providing a direct method for putative signal annotation using databases containing metabolite mass information. Most database interfaces support only simple queries, with the default assumption that molecules either gain or lose a single proton when ionised. In reality, the annotation process is confounded by the fact that many ionisation products are not only molecular isotopes but also salt/solvent adducts and neutral loss fragments of the original metabolites. This report describes an annotation strategy that allows searching based on all potential ionisation products predicted to form during electrospray ionisation (ESI).
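To make the < 5 ppm figure concrete, the following minimal sketch shows how a ppm tolerance translates into an m/z search window. The function names and the example values are illustrative assumptions, not part of MZedDB.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed m/z relative to a theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def mz_window(theoretical_mz: float, tol_ppm: float = 5.0) -> tuple[float, float]:
    """Lower and upper m/z bounds that fall within +/- tol_ppm of the target."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True if the observed m/z matches the target within the ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # Hypothetical example: a target ion at m/z 181.070665 (protonated hexose).
    target = 181.070665
    low, high = mz_window(target, tol_ppm=5.0)
    print(f"5 ppm window: {low:.6f} to {high:.6f}")
    print(within_tolerance(181.0710, target))  # small deviation
    print(within_tolerance(181.0730, target))  # large deviation
```

At m/z 181 a 5 ppm tolerance corresponds to a window of roughly +/- 0.0009, which is why a wrong assumption about the ion type (for example a sodium adduct treated as a protonated molecule) places the true metabolite far outside the search window.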

RESULTS

Metabolite 'structures' harvested from publicly accessible databases were converted into a common format to generate a comprehensive archive in MZedDB. 'Rules' were derived from chemical information that allowed MZedDB to generate a list of adducts and neutral loss fragments putatively able to form for each structure and calculate, on the fly, the exact molecular weight of every potential ionisation product to provide targets for annotation searches based on accurate mass. We demonstrate that data matrices representing populations of ionisation products generated from different biological matrices contain a large proportion (sometimes > 50%) of molecular isotopes, salt adducts and neutral loss fragments. Correlation analysis of ESI-MS data features confirmed the predicted relationships of m/z signals. An integrated isotope enumerator in MZedDB allowed verification of exact isotopic pattern distributions to corroborate experimental data.
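The idea of computing the m/z of every predicted ionisation product and matching observed signals against those targets can be sketched as follows. This is an illustration under stated assumptions: the rule list is a small, hand-picked set of common ESI adducts, not MZedDB's actual rule set, and it ignores the structure-dependent checks (for example whether a molecule can actually lose water) that the abstract describes. All names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.99491462,
        "Na": 22.98976928, "K": 38.96370649, "Cl": 34.96885268}
ELECTRON = 0.00054858

# Assumed rules: (label, atoms gained (+) or lost (-) by the neutral molecule, charge).
RULES = [
    ("[M+H]+",     {"H": 1},           +1),
    ("[M+Na]+",    {"Na": 1},          +1),
    ("[M+K]+",     {"K": 1},           +1),
    ("[M+NH4]+",   {"N": 1, "H": 4},   +1),
    ("[M+H-H2O]+", {"H": -1, "O": -1}, +1),  # protonation plus neutral loss of water
    ("[M-H]-",     {"H": -1},          -1),
    ("[M+Cl]-",    {"Cl": 1},          -1),
]


def mono_mass(composition: dict[str, int]) -> float:
    """Monoisotopic mass of an elemental composition (counts may be negative)."""
    return sum(MONO[element] * count for element, count in composition.items())


def ion_mz(neutral_mass: float, delta: dict[str, int], charge: int) -> float:
    """m/z of an ion: adjust atoms, correct for electron mass, divide by |charge|."""
    return (neutral_mass + mono_mass(delta) - charge * ELECTRON) / abs(charge)


def predicted_ions(composition: dict[str, int]) -> dict[str, float]:
    """m/z targets for every rule in RULES, keyed by ion label."""
    neutral = mono_mass(composition)
    return {label: ion_mz(neutral, delta, charge) for label, delta, charge in RULES}


def annotate(observed_mz: float, candidates: dict[str, dict[str, int]],
             tol_ppm: float = 5.0) -> list[tuple[str, str, float]]:
    """Return (metabolite, ion label, ppm error) for every target within tolerance."""
    hits = []
    for name, composition in candidates.items():
        for label, mz in predicted_ions(composition).items():
            error = (observed_mz - mz) / mz * 1e6
            if abs(error) <= tol_ppm:
                hits.append((name, label, error))
    return hits


if __name__ == "__main__":
    candidates = {"hexose (C6H12O6)": {"C": 6, "H": 12, "O": 6}}
    for hit in annotate(203.0527, candidates):
        print(hit)
```

Searching only for [M+H]+ or [M-H]- would leave a signal such as the sodium adduct in this example unannotated, which is the limitation of simple database queries that the abstract points to.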

CONCLUSION

We conclude that although ultra-high accuracy mass instruments provide major insight into the chemical diversity of biological extracts, a large proportion of signals cannot be readily annotated by simple, automated queries of current databases using computed molecular formulae. Parameterising MZedDB to take into account predicted ionisation behaviour and the biological source of a sample greatly improves both the frequency and accuracy of potential annotation 'hits' in ESI-MS data.
