Gao Yahui, Yang Liu, Kuhn Kristen, Li Wenli, Zanton Geoffrey, Bowman Mary, Zhao Pengju, Zhou Yang, Fang Lingzhao, Cole John B, Rosen Benjamin D, Ma Li, Li Congjun, Baldwin Ransom L, Van Tassell Curtis P, Zhang Zhe, Smith Timothy P L, Liu George E
State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China; Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA; Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA.
Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA; Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA.
J Adv Res. 2025 Apr 19. doi: 10.1016/j.jare.2025.04.014.
Most structural variant (SV) studies in livestock rely on short-read sequencing, whose limited read length makes it difficult to characterize large genomic variants accurately.
Our goal is to identify structural variation and novel sequences specific to the Holstein and Jersey cattle breeds using long-read sequencing and pangenome analyses.
We sequenced 20 Holstein and 8 Jersey cattle with PacBio HiFi to 20× depth and integrated five read-based SV callers and one assembly-based SV caller to identify SVs.
We assembled the 28 genomes, which averaged 3.25 Gb with a contig N50 of 69.36 Mb. Using the ARS-UCD1.2 reference, we obtained Holstein/Jersey SV catalogs of 74,068/54,689 events spanning 202/135 Mb (7.43%/4.97% of the genome). SVs were enriched in less conserved, non-coding, and non-regulatory regions. Among Holsteins with differing feed efficiency (FE), SVs unique to high-FE animals were linked to energy metabolism and olfactory receptors, whereas those specific to low-FE animals were associated with material transport. We constructed Holstein/Jersey pangenome graphs with 148,598/105,875 nodes and 208,891/147,990 edges, representing 47,028/37,137 biallelic and multi-allelic events and 63.75/42.34 Mb of novel sequence. The SV count saturated with 20 Holsteins, whereas adding Jerseys significantly increased it, highlighting breed-specific SV events.
Our long-read data and SV catalogs are valuable resources, revealing that the cattle genome is more complex than previously thought.
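For readers unfamiliar with the summary statistics quoted in the results, the following is a minimal Python sketch, not the authors' pipeline. It shows how a contig N50 is conventionally computed and back-calculates the genome length implied by the reported SV spans and percentages. The function names and the toy contig lengths are hypothetical. The abstract does not state which genome length the percentages use as a denominator, so the comparison with the reference is an inference from the arithmetic.

```python
def contig_n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def implied_genome_mb(span_mb, percent_of_genome):
    """Genome length (Mb) implied by an SV span and its reported percentage."""
    return span_mb / (percent_of_genome / 100.0)


# Hypothetical contig lengths in Mb, for illustration only.
toy_contigs = [80, 70, 50, 30, 20]
n50 = contig_n50(toy_contigs)

# Figures reported in the abstract: SV span (Mb) and share of the genome (%).
holstein_genome_mb = implied_genome_mb(202, 7.43)
jersey_genome_mb = implied_genome_mb(135, 4.97)
```

Both breeds give an implied genome length of roughly 2.72 Gb, which suggests the percentages are expressed relative to the reference genome rather than to the 3.25 Gb average assembly size.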