Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8568, Japan.
DNA Res. 2021 Oct 11;28(6). doi: 10.1093/dnares/dsab021.
The complete sequencing of human centromeres, which are filled with highly repetitive elements, has long been challenging. The basic repeating units of human centromeres are α-satellite monomers of about 171 bp. These monomers are organized into higher-order repeat (HOR) units, and thousands of copies of highly homologous HOR units form large arrays, which have hampered sequence assembly of human centromeres. Because long reads of about 10 kb already cover most HOR unit occurrences, the recent availability of much longer reads is expected to make it possible to observe single-nucleotide or structural variants in individual HOR occurrences. The time has come to examine the complete sequence of human centromeres.
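The length scales mentioned above can be made concrete with a little arithmetic. The following is a minimal sketch, not an analysis from the paper: it takes the approximate 171 bp monomer length and 10 kb read length from the text, while the monomer counts per HOR unit (2, 12, 34) and the function names are hypothetical values chosen only for illustration.

```python
# Approximate values taken from the text.
MONOMER_BP = 171   # length of one alpha-satellite monomer
READ_BP = 10_000   # length of a typical long read

def hor_unit_length(monomers_per_hor: int, monomer_bp: int = MONOMER_BP) -> int:
    """Approximate length in bp of one HOR unit built from n monomers."""
    return monomers_per_hor * monomer_bp

def max_full_units_per_read(monomers_per_hor: int, read_bp: int = READ_BP) -> int:
    """Upper bound on complete HOR units contained in one read.

    The bound is reached only when the read starts at a unit boundary;
    a read starting elsewhere may contain one fewer complete unit.
    """
    return read_bp // hor_unit_length(monomers_per_hor)

# Hypothetical HOR sizes, for illustration only.
for n in (2, 12, 34):
    print(n, hor_unit_length(n), max_full_units_per_read(n))
```

Under these assumptions, a 10 kb read can contain a complete HOR unit only if that unit has at most 10,000 // 171 = 58 monomers, and a longer read raises both that limit and the number of consecutive units seen in a single read.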