A Parallelization Strategy for the Time Efficient Analysis of Thousands of LC/MS Runs in High-Performance Computing Environment.

Affiliations

Department of Pathology, Boston Children's Hospital, and Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, United States.

Department of Neuropsychology and Psychopharmacology, EURON, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht 6229ER, The Netherlands.

Publication information

J Proteome Res. 2022 Nov 4;21(11):2810-2814. doi: 10.1021/acs.jproteome.2c00278. Epub 2022 Oct 6.

DOI:10.1021/acs.jproteome.2c00278
PMID:36201825
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9930095/
Abstract

Combining robust proteomics instrumentation (e.g., the timsTOF Pro) with high-throughput liquid chromatography (LC) systems (e.g., the Evosep One) has made it possible to map the proteomes of thousands of samples. FragPipe is one of the few computational frameworks for protein identification and quantification that can analyze such large data sets in a time-efficient manner. However, it requires so much computational power and data storage that even state-of-the-art workstations are underpowered for proteomics data sets comprising thousands of LC mass spectrometry (LC/MS) runs. To address this, we developed and optimized a FragPipe-based analysis strategy for a high-performance computing environment and used it to analyze 3348 plasma samples (6.4 TB) collected longitudinally from hospitalized COVID-19 patients under the auspices of the Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) study. Our parallelization strategy reduced the total runtime by ∼90%, from a theoretical 116 days to just 9 days in the high-performance computing environment. All code is open-source and can be deployed in any Simple Linux Utility for Resource Management (SLURM) high-performance computing environment, enabling the analysis of large-scale, high-throughput proteomics studies.
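To make the scale of the reported numbers concrete, the following is a minimal Python sketch of the general idea of splitting many LC/MS runs into independent batches for a SLURM job array, plus the arithmetic behind the runtime reduction. It is not the authors' code. The batch count of 100, the run names, and the script `process_batch.sh` are hypothetical placeholders; only the sample count (3348) and the runtimes (116 and 9 days) come from the abstract. The `--array` option and the `SLURM_ARRAY_TASK_ID` variable are standard SLURM features.

```python
def split_round_robin(runs, n_batches):
    """Distribute runs over exactly n_batches lists whose sizes differ by at most one."""
    return [runs[i::n_batches] for i in range(n_batches)]


def runtime_reduction(serial_days, parallel_days):
    """Return (speedup factor, fractional reduction in wall time)."""
    return serial_days / parallel_days, 1 - parallel_days / serial_days


def render_array_script(n_batches, command="./process_batch.sh"):
    """Render a minimal sbatch script; `command` is a hypothetical per-batch worker."""
    return "\n".join([
        "#!/bin/bash",
        "#SBATCH --job-name=lcms_batches",
        f"#SBATCH --array=0-{n_batches - 1}",
        # Each array task receives its own batch index from SLURM.
        f"{command} \"$SLURM_ARRAY_TASK_ID\"",
    ])


# 3348 runs is the figure from the abstract; the names are made up.
runs = [f"run_{i:04d}" for i in range(3348)]
batches = split_round_robin(runs, 100)  # 100 batches is an assumption
speedup, reduction = runtime_reduction(116, 9)
script = render_array_script(len(batches))

print(f"{len(batches)} batches of {len(batches[-1])}-{len(batches[0])} runs")
print(f"{speedup:.1f}x speedup, {reduction:.0%} less wall time")
```

The exact reduction computed from 116 and 9 days is about 92%, which is consistent with the "∼90%" stated in the abstract. A round-robin split is used here because it always yields the requested number of batches, whereas fixed-size contiguous chunking can yield fewer.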
