Wishart David S
Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada.
Bioanalysis. 2009 Dec;1(9):1579-96. doi: 10.4155/bio.09.138.
Most metabolomic data are characterized by complex spectra or chromatograms containing hundreds of peaks or features. While there are many methods for aligning or comparing these spectral features, there are few approaches for actually identifying which peaks correspond to which compounds. Indeed, one of the biggest unmet needs in metabolomics is compound identification. This review describes some of the newly emerging computational strategies that are being used to aid the identification of metabolites in biofluid mixtures analyzed by NMR and MS. The most successful compound-identification strategies typically involve matching spectral features of the unknown compound(s) against curated spectral databases of reference compounds. This approach is known as the identification of 'known unknowns'. In contrast, the identification of truly novel compounds (the 'unknown unknowns') is particularly challenging and requires applying computer-aided structure elucidation methods to the purified compound. The strengths and limitations of these approaches as applied to different analytical technologies (GC-MS, LC-MS and NMR) are discussed, as are the prospects for improving existing strategies.
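The 'known unknowns' strategy of matching an unknown spectrum against a reference library can be illustrated with a minimal sketch. It assumes mass spectra are given as lists of (m/z, intensity) peaks, bins them at unit resolution, and ranks library entries by cosine similarity. The function names, the bin width, the score threshold and the two placeholder library entries are all hypothetical choices made for illustration; they are not taken from the review or from any particular database or software package.

```python
from math import sqrt


def bin_peaks(peaks, bin_width=1.0):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(a, b):
    """Cosine similarity between two binned spectra stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_matches(query_peaks, library, bin_width=1.0, min_score=0.7):
    """Return (name, score) pairs scoring at least min_score, best match first."""
    query = bin_peaks(query_peaks, bin_width)
    scored = [
        (name, cosine_similarity(query, bin_peaks(ref_peaks, bin_width)))
        for name, ref_peaks in library.items()
    ]
    hits = [(name, score) for name, score in scored if score >= min_score]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Placeholder reference spectra (invented values, not real compounds).
library = {
    "compound_A": [(73.0, 100.0), (147.0, 40.0), (218.0, 25.0)],
    "compound_B": [(55.0, 80.0), (91.0, 100.0), (119.0, 30.0)],
}

# A noisy "unknown" spectrum that resembles compound_A.
query = [(73.1, 95.0), (147.2, 45.0), (218.0, 20.0)]

for name, score in rank_matches(query, library):
    print(f"{name}: {score:.3f}")
```

Real identification workflows typically add further evidence such as retention time or retention index, precursor mass, and chemical shift matching for NMR; the sketch covers only the spectral-similarity step.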