State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China.
Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China.
Mol Ecol Resour. 2021 Jan;21(1):201-211. doi: 10.1111/1755-0998.13236. Epub 2020 Aug 17.
The yak is an important livestock animal for the indigenous peoples of the harsh, oxygen-limited Qinghai-Tibetan Plateau and the Hindu Kush ranges of the Himalayas. The yak genome was sequenced in 2012, but its assembly was fragmented because of the inherent limitations of the Illumina short-read sequencing technology used. An accurate and complete reference genome is essential for the study of genetic variation in this species. Long reads are more complete than their short-read counterparts and have been successfully applied to high-quality genome assembly in various species. In this study, we present a high-quality chromosome-scale yak genome assembly (BosGru_PB_v1.0) constructed with long-read sequencing and chromatin interaction technologies. Compared to an existing yak genome assembly (BosGru_v2.0), BosGru_PB_v1.0 shows substantially improved chromosome sequence continuity, reduced ambiguity in repetitive structures, and more complete gene models. To characterize genetic variation in yak, we generated de novo genome assemblies based on Illumina short reads for seven recognized domestic yak breeds in Tibet and Sichuan and one wild yak from Hoh Xil. We compared these eight assemblies to the BosGru_PB_v1.0 genome, obtained a comprehensive map of yak genetic diversity at the whole-genome level, and identified several protein-coding genes absent from the BosGru_PB_v1.0 assembly. Despite the genetic bottleneck experienced by wild yak, their diversity was higher than that of domestic yak. We also identified breed-specific sequences and genes by whole-genome alignment, which may facilitate yak breed identification.