Precise Identification of Higher-Order Repeats (HORs) in T2T-CHM13 Assembly of Human Chromosome 21: Novel 52mer HOR and Failures of Hg38 Assembly.

Author information

Glunčić Matko, Vlahović Ines, Rosandić Marija, Paar Vladimir

Affiliations

Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia.

Department of Interdisciplinary Sciences, Algebra University College, 10000 Zagreb, Croatia.

Publication information

Genes (Basel). 2025 Jul 27;16(8):885. doi: 10.3390/genes16080885.

Abstract

BACKGROUND

Centromeric alpha satellite DNA is organized into higher-order repeats (HORs), whose precise structure is often difficult to resolve in standard genome assemblies. The recent telomere-to-telomere (T2T) assembly of the human genome enables complete analysis of centromeric regions, including the full structure of HOR arrays.

METHODS

We applied the novel high-precision GRMhor algorithm to the complete T2T-CHM13 assembly of human chromosome 21. GRMhor integrates global repeat map (GRM) and monomer distance (MD) diagrams to accurately identify, classify, and visualize HORs and their subfragments.
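The general idea behind a repeat map built from distances can be illustrated with a toy example. The sketch below is not the GRMhor implementation; it is a simplified illustration, assuming that repeat periods show up as peaks in a histogram of distances between consecutive occurrences of the same k-mer. The function name `kmer_distance_histogram`, the choice `k = 8`, and the 12-bp test unit are all hypothetical.

```python
from collections import Counter


def kmer_distance_histogram(seq: str, k: int = 8) -> Counter:
    """Histogram of distances between consecutive occurrences of each k-mer.

    Simplified illustration only; not the published GRMhor algorithm.
    """
    last_seen = {}      # k-mer -> position of its most recent occurrence
    hist = Counter()    # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            hist[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return hist


# Toy input: a made-up 12-bp unit repeated 20 times (not real alpha satellite DNA).
unit = "ACGTTGCAAGCT"
seq = unit * 20

hist = kmer_distance_histogram(seq, k=8)
dominant_period, count = hist.most_common(1)[0]
print(dominant_period)
```

On real centromeric sequence the copies diverge from one another, so the histogram would show broadened or secondary peaks rather than a single clean value.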

RESULTS

The analysis revealed a novel Cascading 11mer HOR array, in which each canonical HOR copy comprises 11 monomers belonging to 10 different monomer types. Subfragments with periodicities of 4, 7, 9, and 20 were identified within the array. A second, complex 23/25mer HOR array of mixed Willard's/Cascading type was also detected. The dominant 8mer and 33mer HORs previously annotated in the hg38 assembly were absent from the T2T-CHM13 assembly, highlighting the limitations of hg38. Notably, we discovered a novel 52mer HOR, the longest alpha satellite HOR unit reported in the human genome to date. Several subfragment repeats correspond to alphoid subfamilies previously identified using restriction enzyme digestion, but are here resolved with higher structural precision.
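The statement that an 11mer HOR copy contains only 10 monomer types means one type occurs twice within each copy. A minimal sketch, assuming monomers have already been classified into type labels: the labels `t1` to `t10`, the position of the repeated type, and the helper `smallest_period` are hypothetical and are not taken from the paper.

```python
def smallest_period(labels):
    """Smallest shift p such that labels[i] == labels[i + p] for every valid i."""
    n = len(labels)
    for p in range(1, n):
        if all(labels[i] == labels[i + p] for i in range(n - p)):
            return p
    return n


# Hypothetical 11-monomer HOR copy in which type "t3" appears twice.
hor_copy = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9", "t10", "t3"]
array = hor_copy * 5  # idealized array of five identical copies

period = smallest_period(array)
n_types = len(set(hor_copy))
print(period, n_types)
```

Exact label matching is a simplification; in a real array, copies are truncated or carry insertions, which is where subfragments with shorter periodicities would have to be handled.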

CONCLUSIONS

Our findings demonstrate the power of GRMhor in resolving complex and previously undetected alpha satellite architectures, including the longest canonical HOR unit identified in the human genome. The precise delineation of superHORs, Cascading structures, and HOR subfragments provides unprecedented insight into the fine-scale organization of the centromeric region of chromosome 21. These results highlight both the inadequacy of earlier assemblies, such as hg38, and the critical importance of complete telomere-to-telomere assemblies for accurately characterizing centromeric DNA.
