

Structurally divergent and recurrently mutated regions of primate genomes.

Authors

Mao Yafei, Harvey William T, Porubsky David, Munson Katherine M, Hoekzema Kendra, Lewis Alexandra P, Audano Peter A, Rozanski Allison, Yang Xiangyu, Zhang Shilong, Gordon David S, Wei Xiaoxi, Logsdon Glennis A, Haukness Marina, Dishuck Philip C, Jeong Hyeonsoo, Del Rosario Ricardo, Bauer Vanessa L, Fattor Will T, Wilkerson Gregory K, Lu Qing, Paten Benedict, Feng Guoping, Sawyer Sara L, Warren Wesley C, Carbone Lucia, Eichler Evan E

Affiliations

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.

Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.

Publication information

bioRxiv. 2023 Mar 7:2023.03.07.531415. doi: 10.1101/2023.03.07.531415.

Abstract

To better understand the pattern of primate genome structural variation, we used multiple long-read sequencing technologies to sequence and assemble the genomes of eight nonhuman primate species, including New World monkeys (owl monkey and marmoset), an Old World monkey (macaque), Asian apes (orangutan and gibbon), and African ape lineages (gorilla, bonobo, and chimpanzee). Compared to the human genome, we identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. Across 50 million years of primate evolution, we estimate that 819.47 Mbp or ~27% of the genome has been affected by SVs based on analysis of these primate lineages. We identify 1,607 structurally divergent regions (SDRs) wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (…) and new lineage-specific genes are generated (e.g., …), and which have become targets of rapid chromosomal diversification and positive selection (e.g., …). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species for the first time.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa23/10028934/859930fc04e0/nihpp-2023.03.07.531415v1-f0001.jpg
