Gebrehiwot Netsanet Z, Aliloo Hassan, Strucken Eva M, Marshall Karen, Al Kalaldeh Mohammad, Missohou Ayao, Gibson John P
Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia.
International Livestock Research Institute and Centre for Tropical Livestock Genetics and Health, Nairobi, Kenya.
Front Genet. 2021 Mar 23;12:584355. doi: 10.3389/fgene.2021.584355. eCollection 2021.
Several studies have evaluated computational methods that infer haplotypes from population genotype data in European cattle populations. However, little is known about how well these methods perform in African indigenous and crossbred populations. This study investigates: (1) global and local ancestry inference; (2) heterozygosity proportion estimation; and (3) genotype imputation in West African indigenous and crossbred cattle populations. Principal component analysis (PCA), ADMIXTURE, and LAMP-LD were used to analyse a medium-density single nucleotide polymorphism (SNP) dataset from Senegalese crossbred cattle. Reference SNP data from East and West African indigenous and crossbred cattle populations were used to investigate the accuracy of imputation from low to medium-density and from medium to high-density SNP datasets using Minimac v3. The first two principal components differentiated European and African cattle from other breeds. Irrespective of whether two or three ancestral breeds were assumed for the Senegalese crossbreds, breed proportion estimates from ADMIXTURE and LAMP-LD were highly correlated (≥ 0.981). The observed ancestral origin heterozygosity proportion in putative F1 crosses was close to the expected value of 1.0, and clearly differentiated F1 from all other crosses. The imputation accuracies (estimated as the correlation between imputed and real genotypes) in crossbred animals ranged from 0.142 to 0.717 when imputing from low to medium-density, and from 0.478 to 0.899 when imputing from medium to high-density. The imputation accuracy was generally higher when the reference data came from the same geographical region as the target population, and when crossbred reference data were used to impute crossbred genotypes. The lowest imputation accuracies were observed for indigenous breed genotypes.
This study shows that ancestral origin heterozygosity can be estimated with high accuracy and will be far superior to observed individual heterozygosity for estimating heterosis in African crossbred populations. It was not possible to achieve high imputation accuracy in West African crossbred or indigenous populations based on reference datasets from East Africa, and population-specific genotyping with high-density SNP assays is required to improve imputation.
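The two summary statistics reported above can be illustrated with a minimal sketch. It assumes that imputation accuracy is the Pearson correlation between true and imputed genotype values, and that ancestral origin heterozygosity is the proportion of loci whose two haplotypes are assigned to different ancestral breeds. The function names, the input layout, and the ancestry labels "A" and "B" are hypothetical and are not taken from the study's pipeline.

```python
from math import sqrt


def imputation_accuracy(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed genotype values (0, 1, 2 or dosages)."""
    n = len(true_genotypes)
    if n != len(imputed_genotypes) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    if var_t == 0 or var_i == 0:
        return float("nan")  # correlation is undefined without variation
    return cov / sqrt(var_t * var_i)


def ancestral_heterozygosity(local_ancestry):
    """Proportion of loci whose two haplotypes carry different ancestry labels.

    local_ancestry: list of (label_hap1, label_hap2) pairs, one per locus
    (hypothetical layout for local ancestry calls).
    """
    if not local_ancestry:
        raise ValueError("no loci supplied")
    het = sum(1 for hap1, hap2 in local_ancestry if hap1 != hap2)
    return het / len(local_ancestry)


# Hypothetical animals: an F1 carries one haplotype from each ancestral breed
# at every locus; a backcross does so at only part of its loci.
f1_animal = [("A", "B")] * 4
backcross = [("A", "B"), ("A", "A"), ("A", "B"), ("A", "A")]
```

Under these assumptions, `ancestral_heterozygosity(f1_animal)` gives 1.0, matching the expected value for F1 crosses, while the hypothetical backcross gives 0.5.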