Leibniz Institute of Plant Biochemistry, Department of Stress- and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.
BMC Bioinformatics. 2010 Mar 22;11:148. doi: 10.1186/1471-2105-11-148.
Mass spectrometry has become the analytical method of choice in metabolomics research. The identification of unknown compounds is the main bottleneck. In addition to the precursor mass, tandem MS spectra carry informative fragment peaks, but spectral libraries of measured reference compounds are far from covering the complete chemical space. Compound libraries such as PubChem or KEGG describe a much larger number of compounds, whose in silico fragmentation can be compared with the spectra of unknown metabolites.
We created the MetFrag suite to obtain a candidate list from compound libraries based on the precursor mass; the candidates are subsequently ranked by the agreement between measured and in silico fragments. In the evaluation, MetFrag ranked most of the correct compounds within the top 3 candidates returned by an exact mass query in KEGG. On the data set of a previously published study, MetFrag obtained better results than the commercial MassFrontier software. Especially for large compound libraries, the candidates with a good score show high structural similarity or differ only in stereochemistry, so a subsequent clustering based on chemical distances reduces this redundancy. The in silico fragmentation requires less than a second per molecule, and on an average desktop PC a MetFrag search takes on average about 30 seconds in KEGG and about 300 seconds in PubChem.
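The two-step workflow described above (candidate selection by precursor mass, then ranking by fragment agreement) can be illustrated with a minimal sketch. It is not MetFrag's implementation or scoring function: the `Candidate` class, the function names, the 10 ppm tolerance and the score (the fraction of measured peaks explained by a fragment) are all assumptions made for illustration, and the in silico fragment masses are supplied by hand rather than generated by bond dissociation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    """Hypothetical library entry; the fields are illustrative only."""
    name: str
    exact_mass: float             # monoisotopic mass used for the library query
    fragment_masses: List[float]  # assumed precomputed in silico fragment masses


def within_ppm(measured: float, theoretical: float, ppm: float) -> bool:
    """True if the measured mass lies within +/- ppm of the theoretical mass."""
    return abs(measured - theoretical) <= theoretical * ppm * 1e-6


def select_candidates(library: List[Candidate], precursor_mass: float,
                      ppm: float = 10.0) -> List[Candidate]:
    """Step 1: keep library entries whose exact mass matches the precursor."""
    return [c for c in library if within_ppm(precursor_mass, c.exact_mass, ppm)]


def score(candidate: Candidate, peaks: List[float], ppm: float = 10.0) -> float:
    """Step 2 (simplified): fraction of measured peaks explained by a fragment."""
    if not peaks:
        return 0.0
    explained = sum(
        1 for peak in peaks
        if any(within_ppm(peak, frag, ppm) for frag in candidate.fragment_masses)
    )
    return explained / len(peaks)


def rank(library: List[Candidate], precursor_mass: float, peaks: List[float],
         ppm: float = 10.0) -> List[Tuple[float, str]]:
    """Return (score, name) pairs, best score first, ties broken by name."""
    candidates = select_candidates(library, precursor_mass, ppm)
    scored = [(score(c, peaks, ppm), c.name) for c in candidates]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

Calling `rank(library, precursor_mass, peaks)` returns the candidates that pass the mass filter, ordered from best to worst agreement with the measured peaks. A realistic score would also take peak intensities and fragmentation energetics into account, which this sketch deliberately leaves out.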
We presented a method that is able to identify small molecules from tandem MS measurements, even without spectral reference data or a large set of fragmentation rules. With today's massive general-purpose compound libraries we obtain dozens of very similar candidates, which still allows a confident estimate of the correct compound class. Our tool MetFrag improves the identification of unknown substances from tandem MS spectra and delivers better results than comparable commercial software. MetFrag is available as a web application, as web services and as a Java library. The web frontend allows the end user to analyse single spectra and browse the results, whereas the web service and console application are intended for batch searches and evaluation.