Construction and assessment of individualized proteogenomic databases for large-scale analysis of nonsynonymous single nucleotide variants.

Authors

Krug Karsten, Popic Sasa, Carpy Alejandro, Taumer Christoph, Macek Boris

Affiliation

Proteome Center Tuebingen, University of Tuebingen, Germany.

Publication

Proteomics. 2014 Dec;14(23-24):2699-708. doi: 10.1002/pmic.201400219. Epub 2014 Nov 17.

DOI: 10.1002/pmic.201400219
PMID: 25251379

Abstract

Next-generation sequencing projects focusing on genomes and transcriptomes identify millions of single nucleotide variants (SNVs), many of which result in single amino acid substitutions. These nonsynonymous (ns) SNVs are typically not incorporated into protein sequence databases used to identify MS/MS data. Here, we perform a comparative analysis of the assembly of nsSNV-containing proteogenomic databases. We use a comprehensive transcriptome and proteome dataset of HeLa cells from the literature to derive and to incorporate SNVs into databases applicable to proteomics search engines, and to assess their performance in the identification of nsSNVs. We assemble the databases by (1) translation of SNV-containing transcripts into all possible reading frames, (2) translation of predicted reading frame, (3) prediction of nsSNVs and subsequent incorporation into canonical protein sequences. We show substantial differences between generated databases in terms of represented nsSNVs and theoretical search space, affecting sensitivity and specificity of database search. We query the databases with >2.2M high-resolution MS/MS spectra using MaxQuant software and identify 451 variant peptides, containing 401 nsSNVs. We conclude that prediction of reading frame and, if applicable, SNV effect result in comprehensive yet compact databases necessary to retain sensitivity in large-scale analysis of nsSNVs called from transcriptomics data.
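The three assembly strategies named in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the function names (`six_frame`, `apply_snv`, `substitute`), the 15-nucleotide transcript, and the SNV position are all hypothetical, and the "predicted" reading frame in strategy 2 is simply assumed to be frame 0 rather than computed by a real frame predictor.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: nuclear code, table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, frame=0):
    """Translate one forward frame (0, 1 or 2); incomplete trailing codons are dropped."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def apply_snv(seq, pos, alt):
    """Return the transcript with the base at 0-based position pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def six_frame(seq):
    """Strategy 1: translate all three frames on both strands."""
    return [translate(s, f) for s in (seq, revcomp(seq)) for f in range(3)]


def substitute(protein, pos, alt):
    """Strategy 3: place a predicted amino acid substitution into a canonical protein."""
    return protein[:pos] + alt + protein[pos + 1:]


reference = "ATGGCCAAGTTTGAA"              # hypothetical transcript, encodes MAKFE
variant = apply_snv(reference, 7, "G")      # AAG -> AGG, i.e. Lys -> Arg

db_all_frames = six_frame(variant)          # strategy 1: six entries per transcript
db_predicted = [translate(variant, 0)]      # strategy 2: frame 0 assumed as the prediction
db_canonical = [substitute(translate(reference), 2, "R")]  # strategy 3

print(len(db_all_frames), db_predicted, db_canonical)
```

Even in this toy case, strategy 1 yields six entries for one transcript while strategies 2 and 3 yield one each, which mirrors the abstract's point about theoretical search space.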

Similar articles

1. Construction and assessment of individualized proteogenomic databases for large-scale analysis of nonsynonymous single nucleotide variants.
   Proteomics. 2014 Dec;14(23-24):2699-708. doi: 10.1002/pmic.201400219. Epub 2014 Nov 17.
2. Mass spectrum sequential subtraction speeds up searching large peptide MS/MS spectra datasets against large nucleotide databases for proteogenomics.
   Genes Cells. 2012 Aug;17(8):633-44. doi: 10.1111/j.1365-2443.2012.01615.x. Epub 2012 Jun 12.
3. An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer.
   Mol Cell Proteomics. 2016 Mar;15(3):1060-71. doi: 10.1074/mcp.M115.056226. Epub 2015 Dec 2.
4. Compact variant-rich customized sequence database and a fast and sensitive database search for efficient proteogenomic analyses.
   Proteomics. 2014 Dec;14(23-24):2742-9. doi: 10.1002/pmic.201400225. Epub 2014 Nov 19.
5. Proteogenomics in microbiology: taking the right turn at the junction of genomics and proteomics.
   Proteomics. 2014 Dec;14(23-24):2360-675. doi: 10.1002/pmic.201400168. Epub 2014 Nov 19.
6. Impact of Nonsynonymous Single-Nucleotide Variations on Post-Translational Modification Sites in Human Proteins.
   Methods Mol Biol. 2017;1558:159-190. doi: 10.1007/978-1-4939-6783-4_8.
7. Proteogenomics of Malignant Melanoma Cell Lines: The Effect of Stringency of Exome Data Filtering on Variant Peptide Identification in Shotgun Proteomics.
   J Proteome Res. 2018 May 4;17(5):1801-1811. doi: 10.1021/acs.jproteome.7b00841. Epub 2018 Apr 16.
8. Proteome-wide onco-proteogenomic somatic variant identification in ER-positive breast cancer.
   Clin Biochem. 2019 Apr;66:63-75. doi: 10.1016/j.clinbiochem.2019.01.005. Epub 2019 Jan 23.
9. Proteogenomics-Guided Evaluation of RNA-Seq Assembly and Protein Database Construction for Emergent Model Organisms.
   Proteomics. 2020 May;20(10):e1900261. doi: 10.1002/pmic.201900261. Epub 2020 May 18.
10. Proteogenomics from a bioinformatics angle: A growing field.
    Mass Spectrom Rev. 2017 Sep;36(5):584-599. doi: 10.1002/mas.21483. Epub 2015 Dec 15.

Cited by

1. Chemoproteogenomic stratification of the missense variant cysteinome.
   Nat Commun. 2024 Oct 28;15(1):9284. doi: 10.1038/s41467-024-53520-x.
2. Moving Toward Metaproteogenomics: A Computational Perspective on Analyzing Microbial Samples via Proteogenomics.
   Methods Mol Biol. 2025;2859:297-318. doi: 10.1007/978-1-0716-4152-1_17.
3. Multi-omic stratification of the missense variant cysteinome.
   bioRxiv. 2023 Aug 14:2023.08.12.553095. doi: 10.1101/2023.08.12.553095.
4. ProteomeGenerator: A Framework for Comprehensive Proteomics Based on de Novo Transcriptome Assembly and High-Accuracy Peptide Mass Spectral Matching.
   J Proteome Res. 2018 Nov 2;17(11):3681-3692. doi: 10.1021/acs.jproteome.8b00295. Epub 2018 Oct 19.
5. Integrating Next-Generation Genomic Sequencing and Mass Spectrometry To Estimate Allele-Specific Protein Abundance in Human Brain.
   J Proteome Res. 2017 Sep 1;16(9):3336-3347. doi: 10.1021/acs.jproteome.7b00324. Epub 2017 Aug 9.
6. Methods, Tools and Current Perspectives in Proteogenomics.
   Mol Cell Proteomics. 2017 Jun;16(6):959-981. doi: 10.1074/mcp.MR117.000024. Epub 2017 Apr 29.
7. Single Amino Acid Variant Profiles of Subpopulations in the MCF-7 Breast Cancer Cell Line.
   J Proteome Res. 2017 Feb 3;16(2):842-851. doi: 10.1021/acs.jproteome.6b00824. Epub 2017 Jan 20.
8. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation.
   Annu Rev Anal Chem (Palo Alto Calif). 2016 Jun 12;9(1):521-45. doi: 10.1146/annurev-anchem-071015-041722. Epub 2016 Mar 30.
9. MSProGene: integrative proteogenomics beyond six-frames and single nucleotide polymorphisms.
   Bioinformatics. 2015 Jun 15;31(12):i106-15. doi: 10.1093/bioinformatics/btv236.