Luo Ling-Yun, Wu Hui, Zhao Li-Ming, Zhang Ya-Hui, Huang Jia-Hui, Liu Qiu-Yue, Wang Hai-Tao, Mo Dong-Xin, EEr He-Hua, Zhang Lian-Quan, Chen Hai-Liang, Jia Shan-Gang, Wang Wei-Min, Li Meng-Hua
Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, China.
State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems; Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs; Engineering Research Center of Grassland Industry, Ministry of Education; College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China.
Nat Genet. 2025 Jan;57(1):218-230. doi: 10.1038/s41588-024-02037-6. Epub 2025 Jan 8.
Ongoing efforts to improve sheep reference genome assemblies still leave many gaps and incomplete regions, resulting in a few common failures and errors in genomic studies. Here, we report a 2.85-Gb gap-free telomere-to-telomere genome of a ram (T2T-sheep1.0), including all autosomes and the X and Y chromosomes. This genome adds 220.05 Mb of previously unresolved regions and 754 new genes to the most updated reference assembly ARS-UI_Ramb_v3.0; it contains four types of repeat units (SatI, SatII, SatIII and CenY) in centromeric regions. T2T-sheep1.0 has a base accuracy of more than 99.999%, corrects several structural errors in previous reference assemblies and improves structural variant detection in repetitive sequences. Alignment of whole-genome short-read sequences of global domestic and wild sheep against T2T-sheep1.0 identifies 2,664,979 new single-nucleotide polymorphisms in previously unresolved regions, which improves the population genetic analyses and detection of selective signals for domestication (for example, ABCC4) and wool fineness (for example, FOXQ1).