Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada.
Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada.
Methods Mol Biol. 2024;2836:3-17. doi: 10.1007/978-1-0716-4007-4_1.
Proteogenomics has revealed that unannotated open reading frames (ORFs) in mRNAs and in noncoding RNAs (ncRNAs) are translated. OpenProt annotates every ORF of at least 30 codons in the transcriptomes of several species and displays functional features associated with the corresponding proteins. It annotates two types of proteins: reference (canonical) proteins, which are already annotated in UniProt, RefSeq, or Ensembl, and noncanonical proteins. Noncanonical proteins fall into two groups: predicted novel isoforms, which display a significant level of homology with a reference protein, and alternative proteins, which are new proteins with no significant homology to known proteins. This chapter describes how to check whether a gene and/or transcript contains multiple ORFs and how to use OpenProt databases to detect alternative proteins and novel isoforms by mass spectrometry-based proteomics.
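To make the 30-codon threshold and the idea of a transcript carrying several ORFs concrete, the following is a minimal Python sketch. It is not OpenProt's annotation pipeline. The function name `find_orfs` is hypothetical, and the sketch assumes that an ORF starts at the first in-frame ATG after the previous stop codon, ends at the next in-frame stop codon of the standard genetic code, and that the codon count includes the start codon but excludes the stop codon. Only the three forward frames of the transcript are scanned.

```python
# Illustrative sketch only; the ORF definition used here is an assumption,
# not a description of how OpenProt itself calls ORFs.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_orfs(transcript, min_codons=30):
    """Return sorted (start, end, frame) tuples for ATG-to-stop ORFs.

    start and end are 0-based, end-exclusive nucleotide coordinates,
    and end includes the stop codon. Codons are counted from the ATG
    up to, but not including, the stop codon.
    """
    seq = transcript.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None  # position of the first in-frame ATG since the last stop
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


# Toy transcript with a 30-codon ORF in frame 0 and a 35-codon ORF in frame 1.
transcript = "ATG" + "GCC" * 29 + "TAA" + "C" + "ATG" + "GCC" * 34 + "TGA"
print(find_orfs(transcript))  # [(0, 93, 0), (94, 202, 1)]
```

A transcript for which this returns more than one tuple contains multiple candidate ORFs under these assumptions. Deciding whether a candidate is a reference protein, a novel isoform, or an alternative protein requires the homology comparison described above, which the sketch does not attempt.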