Kind Tobias, Fiehn Oliver
University of California Davis, Genome Center, Davis, CA 95616, USA.
BMC Bioinformatics. 2006 Apr 28;7:234. doi: 10.1186/1471-2105-7-234.
Metabolomic studies aim to identify and quantify all metabolites in a given biological context. Mass spectrometry is one of the most powerful tools for metabolomic research. However, metabolomics by mass spectrometry always reveals a large number of unknown compounds, which complicates in-depth mechanistic or biochemical understanding. In principle, mass spectrometry can be used for de novo structure elucidation of small molecules, starting with the computation of the elemental composition of an unknown metabolite from accurate masses with errors <5 ppm (parts per million). However, even with very high mass accuracy (<1 ppm), many chemically possible formulae are obtained in higher mass regions. Automatic routines therefore need an additional orthogonal filter to reduce the number of potential elemental compositions. This report demonstrates the necessity of isotope abundance information by mathematical confirmation of the concept.
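The idea of computing elemental compositions from an accurate mass can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the restriction to C, H, N and O, and the upper element counts are assumptions made for illustration, and no chemical rules (valence, element ratios) are applied. Masses are for neutral molecules, so adducts and the electron mass are ignored.

```python
# Monoisotopic masses (u) of the lightest stable isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def formula_mass(counts):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[el] * n for el, n in counts.items())


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def candidate_formulas(target_mass, ppm, max_c=40, max_n=10, max_o=20, max_h=100):
    """Brute-force search for CHNO formulas within +/- ppm of target_mass.

    Hypothetical helper: the element limits are arbitrary and no chemical
    plausibility checks are made.
    """
    tol = target_mass * ppm * 1e-6  # absolute tolerance in u
    hits = []
    for c in range(max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MONO["C"] + n * MONO["N"] + o * MONO["O"]
                rest = target_mass - heavy
                if rest < -tol:
                    break  # more oxygen only makes the formula heavier
                # The tolerance is far smaller than one hydrogen mass,
                # so only the nearest integer hydrogen count can match.
                h = round(rest / MONO["H"])
                if h < 0 or h > max_h:
                    continue
                counts = {"C": c, "H": h, "N": n, "O": o}
                if abs(formula_mass(counts) - target_mass) <= tol:
                    hits.append(counts)
    return hits


if __name__ == "__main__":
    glucose_mass = formula_mass({"C": 6, "H": 12, "N": 0, "O": 6})
    for ppm in (10, 5, 3, 1, 0.1):
        found = candidate_formulas(glucose_mass, ppm)
        print(f"{ppm:>4} ppm: {len(found)} CHNO candidates")
```

With only four elements and a low mass the candidate list stays short. The abstract's point is that it grows quickly at higher masses and once S, P and halogens are admitted.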
High mass accuracy (<1 ppm) alone is insufficient to exclude enough candidates with complex elemental compositions (C, H, N, S, O, P, and potentially F, Cl, Br and Si). Using isotopic abundance patterns as a single further constraint removes >95% of false candidates. This orthogonal filter can condense several thousand candidates down to only a small number of molecular formulae. Example calculations for 10, 5, 3, 1 and 0.1 ppm mass accuracy are given. Corresponding software scripts can be downloaded from http://fiehnlab.ucdavis.edu. A comparison of eight chemical databases revealed that PubChem and the Dictionary of Natural Products can be recommended for automatic queries using molecular formulae.
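How an isotopic abundance pattern can act as an orthogonal filter is sketched below in Python. This is an illustrative simplification, not the authors' scripts. It assumes that only the M+1 peak is compared, that the tolerance is a relative error, and that elements missing from the table contribute nothing to M+1. The last assumption holds for P and F, but Cl and Br show up mainly at M+2 and Si would need its own entry. The names `PLUS1_RATIO`, `m_plus_1` and `isotope_filter` are hypothetical.

```python
# Ratio of the +1 isotope to the lightest isotope, from representative
# natural abundances in percent (13C/12C, 2H/1H, 15N/14N, 17O/16O, 33S/32S).
PLUS1_RATIO = {
    "C": 1.07 / 98.93,
    "H": 0.0115 / 99.9885,
    "N": 0.364 / 99.636,
    "O": 0.038 / 99.757,
    "S": 0.75 / 94.99,
}


def m_plus_1(counts):
    """Predicted M+1 intensity relative to the monoisotopic peak (M = 1.0)."""
    return sum(PLUS1_RATIO.get(el, 0.0) * n for el, n in counts.items())


def isotope_filter(candidates, measured_m1, rel_tol=0.02):
    """Keep formulas whose predicted M+1 lies within rel_tol of the measurement.

    rel_tol=0.02 mirrors the 2% isotopic abundance error mentioned in the
    abstract; treating it as a relative error is an assumption.
    """
    return [
        f for f in candidates
        if abs(m_plus_1(f) - measured_m1) <= rel_tol * measured_m1
    ]


if __name__ == "__main__":
    glucose = {"C": 6, "H": 12, "O": 6}
    other = {"C": 7, "H": 16, "O": 5}  # one more carbon, so a higher M+1
    measured = m_plus_1(glucose)       # stand-in for a measured value
    print(isotope_filter([glucose, other], measured))
```

Because the M+1 intensity is dominated by the carbon count, candidates that agree in mass but differ by even one carbon atom are separated by this check.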
More than 1.6 million molecular formulae in the range 0-500 Da were generated in an exhaustive manner under strict observation of mathematical and chemical rules. Assuming that ion species are fully resolved (either by chromatography or by high resolution mass spectrometry), we conclude that a mass spectrometer capable of 3 ppm mass accuracy and 2% error for isotopic abundance patterns outperforms mass spectrometers with less than 1 ppm mass accuracy or even hypothetical mass spectrometers with 0.1 ppm mass accuracy that do not include isotope information in the calculation of molecular formulae.