Milia Sotiria, Leonard Alexander S, Mapel Xena Marie, Bernal Ulloa Sandra Milena, Drögemüller Cord, Pausch Hubert
Animal Genomics, ETH Zurich, Zurich 8092, Switzerland.
Animal Physiology, ETH Zurich, Zurich 8092, Switzerland.
Genome Res. 2025 Apr 14;35(4):1041-1052. doi: 10.1101/gr.279064.124.
Cattle have been selectively bred for coat color, spotting, and depigmentation patterns. The presumed autosomal dominant variants underlying the characteristic white head of Fleckvieh, Simmental, and Hereford cattle have not yet been identified, although a contribution of structural variation upstream of the gene has been proposed. Here, we construct a graph pangenome from 24 haplotype assemblies representing seven taurine cattle breeds and use long-read sequencing data and pangenome analyses to identify and characterize the white-head-associated locus for the first time. We introduce a pangenome-wide association mapping approach that examines assembly path similarities within the graph. It reveals an association between facial depigmentation and two, most likely serial, alleles of a complex structural variant (SV) 66 kb upstream of the gene. The complex SV contains a variable number of tandemly duplicated 14.3 kb repeats consisting of LTRs, LINEs, and other repetitive elements, which causes misleading alignments of both short and long reads against a linear reference. We align 250 short-read sequencing samples spanning 15 cattle breeds to the pangenome graph, further validating that the alleles of the SV segregate with head depigmentation. From the graph alignment coverage, we estimate a higher repeat count in Hereford than in Simmental and other white-headed cattle breeds, which suggests that the locus is substantially under-assembled in the current Hereford-based cattle reference genome, where fewer copies are present. Our work shows that exploiting assembly path similarities within graph pangenomes can reveal trait-associated complex SVs.
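The coverage-based repeat count mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that the 14.3 kb repeat unit is collapsed into a single region of the graph, so that read depth over it scales with the total number of copies, and that single-copy flanking regions give the diploid baseline depth. The function name `estimate_repeat_copies`, its parameters, and the depth values are hypothetical and chosen only for illustration.

```python
from statistics import median


def estimate_repeat_copies(repeat_depths, flank_depths, ploidy=2):
    """Estimate total repeat copies in one sample (summed over haplotypes).

    repeat_depths: per-position read depth over the collapsed repeat unit
                   (assumption: all copies align to this one unit).
    flank_depths:  per-position read depth over single-copy flanking sequence.
    ploidy:        number of haplotypes contributing to the flanking depth.
    """
    if not repeat_depths or not flank_depths:
        raise ValueError("both depth lists must be non-empty")

    baseline = median(flank_depths)
    if baseline <= 0:
        raise ValueError("flanking depth must be positive")

    # Depth contributed by one copy on one haplotype.
    depth_per_copy = baseline / ploidy

    # Medians are used so that a few outlier positions do not skew the estimate.
    return median(repeat_depths) / depth_per_copy


if __name__ == "__main__":
    # Hypothetical depths: about 30x in the flanks, about 120x over the repeat.
    copies = estimate_repeat_copies([120, 118, 122], [30, 29, 31])
    print(f"estimated total copies across both haplotypes: {copies:.1f}")
```

Comparing such estimates between breeds is only meaningful if the samples are aligned to the same graph and normalized in the same way, which is why the sketch divides by each sample's own flanking depth.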