A deep audit of the PeptideAtlas database uncovers evidence for unannotated coding genes and aberrant translation.

Authors

Rodriguez Jose Manuel, Maquedano Miguel, Cerdan-Velez Daniel, Calvo Enrique, Vazquez Jesús, Tress Michael L

Affiliations

Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain.

CIBER de Enfermedades Cardiovasculares (CIBERCV), 28029 Madrid, Spain.

Publication information

bioRxiv. 2024 Nov 15:2024.11.14.623419. doi: 10.1101/2024.11.14.623419.

DOI: 10.1101/2024.11.14.623419
PMID: 39605392
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11601488/
Abstract

The human genome has been the subject of intense scrutiny by experimental and manual curation projects for more than two decades. Novel coding genes have been proposed from large-scale RNASeq, ribosome profiling and proteomics experiments. Here we carry out an in-depth analysis of an entire proteomics database. We analysed the proteins, peptides and spectra housed in the human build of the PeptideAtlas proteomics database to identify coding regions that are not yet annotated in the GENCODE reference gene set. We find support for hundreds of missing alternative protein isoforms and unannotated upstream translations, and evidence of cross-contamination from other species. There was reliable peptide evidence for 34 novel unannotated open reading frames (ORFs) in PeptideAtlas. We find that almost half belong to coding genes that are missing from GENCODE and other reference sets. Most of the remaining ORFs were not conserved beyond human, however, and their peptide confirmation was restricted to cancer cell lines. We show that this is strong evidence for aberrant translation, raising important questions about the extent of aberrant translation and how these ORFs should be annotated in reference genomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f266/11601488/8860d3c4a6ed/nihpp-2024.11.14.623419v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f266/11601488/687fc5182f5e/nihpp-2024.11.14.623419v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f266/11601488/8544d57f25ed/nihpp-2024.11.14.623419v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f266/11601488/a7405d8f77aa/nihpp-2024.11.14.623419v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f266/11601488/814ac9280e17/nihpp-2024.11.14.623419v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f266/11601488/131a510b39bf/nihpp-2024.11.14.623419v1-f0006.jpg
