Blaženović Ivana, Kind Tobias, Torbašinović Hrvoje, Obrenović Slobodan, Mehta Sajjan S, Tsugawa Hiroshi, Wermuth Tobias, Schauer Nicolas, Jahn Martina, Biedendieck Rebekka, Jahn Dieter, Fiehn Oliver
Technische Universität Braunschweig - Institute of Microbiology, Brunswick, Germany.
Metabolomic Discoveries GmbH, Potsdam, Germany.
J Cheminform. 2017 May 25;9(1):32. doi: 10.1186/s13321-017-0219-x.
In mass spectrometry-based untargeted metabolomics, rarely more than 30% of the compounds are identified. Without the true identity of these molecules, it is impossible to draw conclusions about the biological mechanisms, pathway relationships and provenance of compounds. At present, the only way to address this gap is to use in silico fragmentation software, which identifies unknown compounds by comparing theoretical MS/MS fragmentations from target structures with experimental tandem mass spectra (MS/MS) and ranking the matches. We compared the performance of four publicly available in silico fragmentation algorithms (MetFragCL, CFM-ID, MAGMa+ and MS-FINDER) that participated in the 2016 CASMI challenge. We found that the use of metadata, the choice of weighting factors and the manner of combining different tools determined the outcome of each method. We comprehensively analysed how the outcomes of different tools could be combined and reached a final success rate of 93% for the training data and 87% for the challenge data, using a combination of MAGMa+, CFM-ID and compound importance information along with MS/MS matching. Matching MS/MS spectra against the MS/MS libraries without using any in silico tool yielded 60% correct hits, showing that the use of in silico methods is still important.
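The idea of combining tool outputs with weighting factors can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the min-max normalization, the weighted-sum rule, the tool keys and every number below are assumptions chosen only for illustration.

```python
from collections import defaultdict


def combine_scores(tool_scores, weights):
    """Rank candidates by a weighted sum of per-tool normalized scores.

    tool_scores: {tool_name: {candidate_id: raw_score}} (hypothetical layout)
    weights:     {tool_name: weight}; tools without a weight count as 0
    """
    combined = defaultdict(float)
    for tool, scores in tool_scores.items():
        if not scores:
            continue
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        for candidate, raw in scores.items():
            # Min-max normalize so tools with different score scales are comparable.
            norm = (raw - lo) / span if span > 0 else 1.0
            combined[candidate] += weights.get(tool, 0.0) * norm
    # Highest combined score first; ties are broken by candidate id.
    return sorted(combined, key=lambda c: (-combined[c], c))


def top_k_rate(rankings, truths, k=1):
    """Fraction of challenges whose correct structure is ranked within the top k."""
    hits = sum(1 for ch, ranked in rankings.items() if truths[ch] in ranked[:k])
    return hits / len(rankings)


# Made-up scores for three candidate structures from two tools.
example_scores = {
    "magma_plus": {"A": 0.9, "B": 0.5, "C": 0.1},
    "cfm_id": {"A": 0.2, "B": 0.8, "C": 0.4},
}
example_weights = {"magma_plus": 0.5, "cfm_id": 0.5}
ranking = combine_scores(example_scores, example_weights)
print(ranking)
```

In this toy case candidate B ranks first even though neither tool alone separates it clearly from the others, which is the kind of effect a combined ranking aims for. A success rate such as those quoted in the abstract corresponds to `top_k_rate` with `k=1` evaluated over all challenges, under the assumption that success means the correct structure is ranked first.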