ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, 243122, India.
ICAR-National Bureau of Animal Genetic Resources, Karnal, Haryana, 132001, India.
BMC Genomics. 2024 Nov 5;25(1):1043. doi: 10.1186/s12864-024-10924-9.
This study set out to identify population-stratifying and ancestry-informative markers in Indian, Chinese, and wild yak populations from whole genome resequencing (WGS) data, comparing three selection strategies (Delta, Pairwise Wright's Fixation Index (F_ST), and Informativeness of Assignment) across marker densities of 5,000 to 25,000 SNPs. The study used WGS data on 105 individuals from three yak cohorts: Indian yak (n = 29), Chinese yak (n = 61), and wild yak (n = 15).
Variant calling in the GATK program with strict quality control yielded 1,002,970 high-quality, independent (LD-pruned) SNP markers across the yak autosomes. The markers were ranked in the Toolbox for Ranking and Evaluation of SNPs (TRES) program, in which the three criteria (Delta, Pairwise Wright's Fixation Index (F_ST), and Informativeness of Assignment) were used to identify population-stratifying and ancestry-informative markers across the datasets. The top-ranked 5,000 (5K), 10,000 (10K), 15,000 (15K), 20,000 (20K), and 25,000 (25K) SNPs were identified from each dataset, and their composition and performance were assessed using different criteria. With the full-density dataset (105 individuals, 1,002,970 markers), the average genomic breed clustering of the Indian, Chinese, and wild yak cohorts was 81.74%, 80.02%, and 83.62%, respectively. The Informativeness of Assignment criterion at 10K density emerged as the best combination for the three cohorts, with 86.94%, 96.46%, and 98.20% clustering for Indian, Chinese, and wild yak, respectively. Genomic breed clustering scores of the Indian, Chinese, and wild yak cohorts increased on average by 7.56%, 22.72%, and 30.35%, respectively, over the estimates from the original dataset. The selected markers overlapped multiple protein-coding genes within a 10 kb window, including ADGRB3, ANK1, CACNG7, CALN1, CHCHD2, CREBBP, GLI3, KHDRBS2, and OSBPL10.
This is the first report to elucidate low-density SNP marker sets with population-stratifying and ancestry-informative properties in three yak groups using WGS data. The results are relevant to applying genomic selection with cost-effective low-density SNP panels in yak populations worldwide.
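The three ranking criteria named in the abstract can be illustrated with a minimal Python sketch for biallelic SNPs. It uses the textbook definitions: Delta as the absolute allele-frequency difference between two populations, Wright's F_ST as (H_T - H_S) / H_T with equally weighted populations, and the informativeness-for-assignment statistic I_n of Rosenberg et al. It is an assumption that these match what TRES computes internally (TRES may, for example, apply sample-size corrections), and the SNP names and allele frequencies below are hypothetical toy values, not data from the study.

```python
import math

def delta(p1, p2):
    """Absolute allele-frequency difference between two populations."""
    return abs(p1 - p2)

def pairwise_fst(p1, p2):
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    Assumes equal population weights and no sample-size correction.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                           # pooled heterozygosity
    if h_t == 0:
        return 0.0                                          # monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2       # mean within-population heterozygosity
    return (h_t - h_s) / h_t

def _xlogx(x):
    """x * ln(x), with the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0

def informativeness(freqs):
    """I_n for one biallelic SNP, given the reference-allele frequency
    in each of K populations (ranges from 0 to ln K)."""
    k = len(freqs)
    total = 0.0
    # Sum over both alleles: the reference allele and its complement.
    for allele_freqs in (freqs, [1 - p for p in freqs]):
        p_mean = sum(allele_freqs) / k
        total += -_xlogx(p_mean) + sum(_xlogx(p) for p in allele_freqs) / k
    return total

def rank_snps(snp_freqs, top_k):
    """Return the names of the top_k SNPs ranked by I_n, highest first."""
    ranked = sorted(snp_freqs, key=lambda name: informativeness(snp_freqs[name]),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical reference-allele frequencies in (Indian, Chinese, wild) yak.
toy_freqs = {
    "snp_a": [0.95, 0.10, 0.05],
    "snp_b": [0.50, 0.50, 0.50],
    "snp_c": [0.20, 0.60, 0.90],
}

print(rank_snps(toy_freqs, top_k=2))
```

In this toy example the uninformative marker (identical frequencies in all three groups) scores zero and is dropped, which mirrors how a low-density panel keeps only the markers that best separate the cohorts.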