Vasylieva Valeriia, Arefiev Ihor, Bourassa Francis, Trifiro Félix-Antoine, Brunet Marie A
Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada.
Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada.
J Proteome Res. 2024 Dec 6;23(12):5233-5249. doi: 10.1021/acs.jproteome.4c00116. Epub 2024 Nov 1.
Over the past decade, technological advances in genomics and transcriptomics have revealed pervasive translation throughout mammalian genomes. The resulting putative proteins are usually excluded from proteomics analyses because they are absent from common protein repositories. A sizable portion of these noncanonical proteins is translated from pseudogenes, which are commonly described as defective copies of coding genes that are unable to produce proteins. Here, we suggest that proteomics can help in their annotation. First, we define important terms and review specific examples that underline the caveats in pseudogene annotation and their coding potential. Then, we discuss the challenges inherent to pseudogenes that have thus far made their confident identification in omics data difficult. Finally, we identify recent developments in experimental procedures, instrumentation, and computational methods in proteomics that put the field in a unique position to solve the pseudogene annotation conundrum.