A Description of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) Common Data Analysis Pipeline.

Authors

Rudnick Paul A, Markey Sanford P, Roth Jeri, Mirokhin Yuri, Yan Xinjian, Tchekhovskoi Dmitrii V, Edwards Nathan J, Thangudu Ratna R, Ketchum Karen A, Kinsinger Christopher R, Mesri Mehdi, Rodriguez Henry, Stein Stephen E

Affiliations

Spectragen Informatics, Bainbridge Island, Washington 98110, United States.

Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States.

Publication information

J Proteome Res. 2016 Mar 4;15(3):1023-32. doi: 10.1021/acs.jproteome.5b01091. Epub 2016 Feb 25.

DOI:10.1021/acs.jproteome.5b01091

PMID:26860878

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5117628/

Abstract

The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
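The last two pipeline stages named in the abstract, false-discovery rate-based filtering and gene-level roll-up of spectral counts, can be illustrated with a minimal Python sketch. It is not the CDAP implementation. It assumes a simple target-decoy FDR estimate (decoys divided by targets above a score cutoff), which the abstract does not specify, and the `PSM` record, its field names, and the 1% default threshold are hypothetical choices made only for illustration.

```python
from __future__ import annotations

from collections import Counter
from typing import NamedTuple


class PSM(NamedTuple):
    """Hypothetical minimal PSM record; field names are illustrative only."""
    spectrum_id: str
    peptide: str
    gene: str
    score: float    # assumption: higher score means a better match
    is_decoy: bool  # True if the match came from a decoy sequence


def filter_by_fdr(psms: list[PSM], max_fdr: float = 0.01) -> list[PSM]:
    """Return target PSMs accepted at an estimated FDR <= max_fdr.

    Assumed estimate: FDR = (#decoys) / (#targets) among the top-ranked
    PSMs. The deepest rank that still satisfies the threshold is used.
    """
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = 0
    decoys = 0
    accepted = 0  # number of top-ranked PSMs inside the cutoff
    for rank, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            accepted = rank
    # Decoys inside the cutoff are dropped from the report.
    return [p for p in ranked[:accepted] if not p.is_decoy]


def gene_spectral_counts(psms: list[PSM]) -> Counter:
    """Roll accepted PSMs up to gene level by counting spectra per gene."""
    return Counter(p.gene for p in psms)


if __name__ == "__main__":
    example = [
        PSM("s1", "PEPA", "GENE1", 10.0, False),
        PSM("s2", "PEPB", "GENE1", 9.0, False),
        PSM("s3", "PEPC", "GENE2", 8.0, False),
        PSM("s4", "XXXX", "DECOY", 7.0, True),
        PSM("s5", "PEPD", "GENE2", 6.0, False),
    ]
    kept = filter_by_fdr(example, max_fdr=0.0)
    print(gene_spectral_counts(kept))
```

The sketch counts spectra only. Precursor peak areas or iTRAQ reporter ion log-ratios would need additional fields and a different aggregation, and gene-based parsimony (resolving peptides shared between genes) is left out entirely.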

