Genetic variation in recalcitrant repetitive regions of the genome.

Author information

Shukla Harsh G, Chakraborty Mahul, Emerson J J

Affiliations

Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA.

Graduate Program in Mathematical, Computational and Systems Biology, University of California Irvine, Irvine, California 92697, USA.

Publication information

Genome Res. 2025 Aug 5. doi: 10.1101/gr.280728.125.

Abstract

Many essential functions of organisms are encoded in highly repetitive genomic regions, including histones involved in DNA packaging, centromeres that are core components of chromosome segregation, ribosomal RNA comprising the protein translation machinery, telomeres that ensure chromosome integrity, piRNA clusters encoding host defenses against selfish elements, and virtually the entire Y Chromosome. These regions, formed by highly similar tandem arrays, pose significant challenges for experimental and computational studies, impeding sequence-level descriptions essential for understanding genetic variation. Here, we report the assembly and variation analysis of such repetitive regions in , offering significant improvements to the existing community reference assembly. Our work successfully recovers previously elusive segments, including complete reconstructions of the histone locus and the pericentric heterochromatin of the X Chromosome, spanning the locus to the distal flank of the rDNA cluster. To infer structural changes in these regions where alignments are often not practicable, we introduce landmark anchors based on unique variants that are putatively orthologous. These regions display considerable structural variation between different strains, exhibiting differences in copy number and organization of homologous repeat units between haplotypes. In the histone cluster, although we observe minimal genetic exchange indicative of meiotic crossing over, the variation patterns suggest mechanisms such as unequal sister chromatid exchange. We also examine the prevalence and scale of concerted evolution in the histone and clusters and discuss the mechanisms underlying these observed patterns.
