Higher Rates of Processed Pseudogene Acquisition in Humans and Three Great Apes Revealed by Long-Read Assemblies.

Affiliations

Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

Publication information

Mol Biol Evol. 2021 Jun 25;38(7):2958-2966. doi: 10.1093/molbev/msab062.

Abstract

LINE-1-mediated retrotransposition of protein-coding mRNAs is an active process in modern humans for both germline and somatic genomes. Prior works that surveyed human data mostly relied on detecting discordant mappings of paired-end short reads, or exon junctions contained in short reads. Moreover, there have been few genome-wide comparisons between gene retrocopies in great apes and humans. In this study, we introduced a more sensitive and accurate method to identify processed pseudogenes. Our method utilizes long-read assemblies, and more importantly, is able to provide full-length retrocopy sequences as well as flanking regions which are missed by short-read based methods. From 22 human individuals, we pinpointed 40 processed pseudogenes that are not present in the human reference genome GRCh38 and identified 17 pseudogenes that are in GRCh38 but absent from some input individuals. This represents a significantly higher discovery rate than previous reports (39 pseudogenes not in the reference genome out of 939 individuals). We also provided an overview of lineage-specific retrocopies in chimpanzee, gorilla, and orangutan genomes.
