PeakBot: machine-learning-based chromatographic peak picking.

Affiliations

Department of Analytical Chemistry, University of Vienna, A-1090 Vienna, Austria.

Department of Agrobiotechnology IFA-Tulln, Institute of Bioanalytics and Agro-Metabolomics, University of Natural Resources and Life Sciences, Vienna, A-3430 Tulln, Austria.

Publication information

Bioinformatics. 2022 Jun 27;38(13):3422-3428. doi: 10.1093/bioinformatics/btac344.

DOI: 10.1093/bioinformatics/btac344
PMID: 35604083
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9237678/
Abstract

MOTIVATION

Chromatographic peak picking is among the first steps in data processing workflows of raw LC-HRMS datasets in untargeted metabolomics applications. Its performance is crucial for the holistic detection of all metabolic features as well as their relative quantification for statistical analysis and metabolite identification. Random noise, non-baseline separated compounds and unspecific background signals complicate this task.

RESULTS

A machine-learning-based approach named PeakBot was developed for detecting chromatographic peaks in LC-HRMS profile-mode data. It first detects all local signal maxima in a chromatogram, which are then extracted as super-sampled standardized areas (retention time versus m/z). These are subsequently inspected by a custom-trained convolutional neural network that forms the basis of PeakBot's architecture. For each local maximum, the model reports whether it is the apex of a chromatographic peak, together with its peak center and bounding box. In the training and independent validation datasets used for development, PeakBot discriminated between chromatographic peaks and background signals with high performance (accuracy of 0.99). Training the machine-learning model requires a minimum of 100 reference features, from which it learns the peak characteristics needed for high-quality untargeted peak picking. PeakBot is implemented in Python (3.8) and uses the TensorFlow (2.5.0) package for machine-learning-related tasks. It has been tested on Linux and Windows.
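
The first two steps described above, finding local signal maxima and cutting out standardized retention-time versus m/z areas around them, can be illustrated with a minimal NumPy sketch. This is not PeakBot's implementation. The function names `find_local_maxima` and `extract_area`, the 8-neighbour maximum test, the bilinear resampling, the window half-widths and the 32 x 32 output size are all assumptions chosen for illustration. The data are also assumed to be already binned onto a regular retention-time by m/z grid.

```python
import numpy as np


def find_local_maxima(grid, min_intensity=0.0):
    """Return (scan, mz) indices of cells strictly above all 8 neighbours.

    Hypothetical helper: `grid` is a 2D intensity array (rows = scans,
    columns = m/z bins). `min_intensity` is an assumed noise threshold.
    """
    grid = np.asarray(grid, dtype=float)
    # Pad with -inf so that border cells can still be compared safely.
    padded = np.pad(grid, 1, mode="constant", constant_values=-np.inf)
    rows, cols = grid.shape
    is_max = grid > min_intensity
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbour = padded[1 + dr : 1 + dr + rows, 1 + dc : 1 + dc + cols]
            is_max &= grid > neighbour
    return [(int(r), int(c)) for r, c in np.argwhere(is_max)]


def extract_area(grid, rts, mzs, apex, rt_half_width, mz_half_width, shape=(32, 32)):
    """Resample a window around `apex` onto a fixed-size, max-normalised area.

    Hypothetical helper: `rts` and `mzs` are the increasing axis values of
    `grid`. Resampling is separable linear (bilinear) interpolation, and
    positions outside the measured range are filled with zero.
    """
    grid = np.asarray(grid, dtype=float)
    r, c = apex
    new_rts = np.linspace(rts[r] - rt_half_width, rts[r] + rt_half_width, shape[0])
    new_mzs = np.linspace(mzs[c] - mz_half_width, mzs[c] + mz_half_width, shape[1])
    # Interpolate along m/z for every scan, then along retention time.
    along_mz = np.array(
        [np.interp(new_mzs, mzs, row, left=0.0, right=0.0) for row in grid]
    )
    area = np.array(
        [np.interp(new_rts, rts, col, left=0.0, right=0.0) for col in along_mz.T]
    ).T
    # Standardise so that the most intense point of the area equals 1.
    peak = area.max()
    return area / peak if peak > 0 else area
```

In a pipeline of this kind, each returned area would then be passed to a classifier, such as the convolutional network described above, to decide whether the local maximum is a true peak apex.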

AVAILABILITY AND IMPLEMENTATION

The package is available free of charge for non-commercial use (CC BY-NC-SA). It is available at https://github.com/christophuv/PeakBot.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e23/9237678/b72bf6e2e86a/btac344f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e23/9237678/8767eb2a068a/btac344f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e23/9237678/8a008fbf10bd/btac344f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e23/9237678/11066fddacc3/btac344f4.jpg
