In silico fragmentation for computer assisted identification of metabolite mass spectra.

Affiliation

Leibniz Institute of Plant Biochemistry, Department of Stress- and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.

Publication information

BMC Bioinformatics. 2010 Mar 22;11:148. doi: 10.1186/1471-2105-11-148.

DOI:10.1186/1471-2105-11-148

PMID:20307295

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2853470/

Abstract

BACKGROUND

Mass spectrometry has become the analytical method of choice in metabolomics research, but the identification of unknown compounds remains the main bottleneck. In addition to the precursor mass, tandem MS spectra carry informative fragment peaks, yet spectral libraries of measured reference compounds are far from covering the complete chemical space. Compound libraries such as PubChem or KEGG describe a larger number of compounds, whose in silico fragmentation can be compared with the spectra of unknown metabolites.

RESULTS

We created the MetFrag suite to obtain a candidate list from compound libraries based on the precursor mass; the candidates are then ranked by the agreement between measured and in silico fragments. In the evaluation, MetFrag ranked most of the correct compounds within the top 3 candidates returned by an exact mass query in KEGG. Compared to a previously published study, MetFrag obtained better results than the commercial MassFrontier software. Especially for large compound libraries, the candidates with a good score show high structural similarity or differ only in stereochemistry; a subsequent clustering based on chemical distances reduces this redundancy. The in silico fragmentation requires less than a second to process a molecule, and on an average desktop PC MetFrag performs a search in KEGG or PubChem within 30 or 300 seconds on average, respectively.
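The two-step workflow described here (candidate retrieval by precursor mass, then ranking by fragment agreement) can be illustrated with a minimal Python sketch. This is not MetFrag's implementation or scoring function, which the abstract does not specify: the `Candidate` class, the ppm tolerance, the fraction-of-explained-peaks score and all mass values are hypothetical, and the in silico fragment masses are assumed to be precomputed.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A library compound with precomputed (hypothetical) in silico fragments."""
    name: str
    exact_mass: float                      # mass used for the library query
    fragment_masses: list = field(default_factory=list)


def within_ppm(measured, theoretical, ppm):
    """True if the measured mass lies within +/- ppm of the theoretical mass."""
    return abs(measured - theoretical) <= theoretical * ppm * 1e-6


def select_candidates(library, query_mass, ppm=10.0):
    """Step 1: keep only compounds whose exact mass matches the query mass."""
    return [c for c in library if within_ppm(query_mass, c.exact_mass, ppm)]


def score(candidate, peaks, ppm=10.0):
    """Step 2 (toy score): fraction of measured peaks matched by any fragment."""
    if not peaks:
        return 0.0
    matched = sum(
        1 for mz in peaks
        if any(within_ppm(mz, frag, ppm) for frag in candidate.fragment_masses)
    )
    return matched / len(peaks)


def rank(library, query_mass, peaks, ppm=10.0):
    """Retrieve candidates by mass, then sort them by descending score."""
    candidates = select_candidates(library, query_mass, ppm)
    return sorted(candidates, key=lambda c: score(c, peaks, ppm), reverse=True)


# Toy data: all masses are made up for illustration only.
library = [
    Candidate("B", 180.0634, [163.06, 120.00]),
    Candidate("A", 180.0634, [163.06, 145.05, 85.03]),
    Candidate("C", 250.0000, [100.00]),
]
peaks = [163.06, 145.05, 85.03]
ranked = rank(library, 180.0634, peaks)
print([(c.name, round(score(c, peaks), 2)) for c in ranked])
```

In this toy run, candidate C is discarded by the mass filter, and A ranks above B because its fragments explain more of the measured peaks. A realistic score would presumably also weight peak intensity and other evidence, which this sketch ignores.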

CONCLUSIONS

We presented a method that is able to identify small molecules from tandem MS measurements, even without spectral reference data or a large set of fragmentation rules. With today's massive general-purpose compound libraries we obtain dozens of very similar candidates, which still allows a confident estimate of the correct compound class. Our tool MetFrag improves the identification of unknown substances from tandem MS spectra and delivers better results than comparable commercial software. MetFrag is available as a web application, as web services and as a Java library. The web frontend allows the end user to analyse single spectra and browse the results, whereas the web service and console application are intended for batch searches and evaluation.
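Grouping near-identical candidates, so that dozens of similar hits collapse into a few compound classes, can be sketched as follows. The abstract does not name the chemical distance or clustering algorithm used, so this example substitutes a Tanimoto similarity on hypothetical fingerprint feature sets and a simple greedy "leader" clustering; the threshold and the feature names are assumptions.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_clusters(ranked, threshold=0.8):
    """Greedy clustering over (name, fingerprint) pairs given in rank order.

    Each candidate joins the first cluster whose leader (its best-ranked
    member) is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for name, fingerprint in ranked:
        for cluster in clusters:
            leader_fingerprint = cluster[0][1]
            if tanimoto(fingerprint, leader_fingerprint) >= threshold:
                cluster.append((name, fingerprint))
                break
        else:
            # No sufficiently similar leader was found.
            clusters.append([(name, fingerprint)])
    return clusters


# Toy fingerprints: the feature labels are hypothetical placeholders.
ranked_hits = [
    ("hit_1", frozenset({"f1", "f2", "f3", "f4", "f5"})),
    ("hit_1_stereoisomer", frozenset({"f1", "f2", "f3", "f4", "f5"})),
    ("hit_2", frozenset({"f1", "f9"})),
]
groups = leader_clusters(ranked_hits)
print([[name for name, _ in group] for group in groups])
```

Because stereoisomers typically share the same 2D fingerprint, they fall into one cluster, and the user only needs to inspect one representative per group.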
