Gao Ning, Teng Jinyan, Ye Shaopan, Yuan Xiaolong, Huang Shuwen, Zhang Hao, Zhang Xiquan, Li Jiaqi, Zhang Zhe
National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China.
Front Genet. 2018 Aug 31;9:364. doi: 10.3389/fgene.2018.00364. eCollection 2018.
In recent years, a series of methods for genomic prediction (GP) have been established, and the advantages of GP over pedigree best linear unbiased prediction (BLUP) have been reported. However, the majority of previously proposed GP models are based purely on mathematical considerations and seldom take the abundant biological knowledge into account. The predictive ability of those models depends largely on the consistency between the statistical assumptions and the underlying genetic architectures of the traits of interest. In this study, gene annotation information was incorporated into GP models by constructing haplotypes from SNPs mapped to genic regions. Haplotype allele similarity between pairs of individuals was measured through different approaches at the single-gene level and then converted to the whole-genome level; the resulting similarity matrix was treated as a special kernel and used in kernel-based GP models. The results showed that the gene annotation guided methods gave higher or at least comparable predictive ability in some traits, especially in the Arabidopsis dataset and the rice breeding population. Compared to SNP models and haplotype models without gene annotation, the gene annotation based models improved the predictive ability by 0.56% to 26.67% in the Arabidopsis dataset and by 1.62% to 16.53% in the rice breeding population. In a chicken population, however, incorporating gene annotation slightly improved the predictive ability for several traits but showed no extra gain for the remaining traits. In conclusion, integrating gene annotation into GP models could be beneficial for some traits, species, and populations compared to SNP models and haplotype models without gene annotation. However, more studies are needed to investigate the characteristics of these gene annotation guided models in depth.
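The kernel construction described in the abstract (haplotype allele similarity per gene, then aggregation to the whole genome) can be illustrated with a minimal sketch. The abstract does not specify the similarity measures used, so everything here is an assumption made for illustration: the function name `gene_haplotype_kernel`, the input layout (phased 0/1 alleles in an array of shape individuals x 2 x SNPs, plus a list of SNP indices per gene), the per-gene measure (the fraction of the four haplotype pairings between two individuals that carry an identical haplotype allele), and the plain average over genes.

```python
import numpy as np

def gene_haplotype_kernel(haps, gene_snps):
    """Illustrative gene-based haplotype similarity kernel (not the paper's exact measure).

    haps      : int array, shape (n_individuals, 2, n_snps), phased alleles.
    gene_snps : list of index lists, the SNPs mapped to each gene (assumed given).
    """
    n = haps.shape[0]
    K = np.zeros((n, n))
    for idx in gene_snps:
        # Haplotype of each chromosome copy restricted to this gene: (n * 2, s)
        flat = haps[:, :, idx].reshape(n * 2, -1)
        # Give every distinct haplotype allele in this gene an integer label
        _, labels = np.unique(flat, axis=0, return_inverse=True)
        labels = np.asarray(labels).reshape(n, 2)
        # Compare all 2 x 2 haplotype pairings between individuals i and j
        same = labels[:, None, :, None] == labels[None, :, None, :]
        # Per-gene similarity: share of the four pairings that are identical
        K += same.mean(axis=(2, 3))
    # Whole-genome similarity: average of the per-gene similarities
    return K / len(gene_snps)

# Toy data: 3 individuals, 4 SNPs, 2 hypothetical genes of 2 SNPs each
haps = np.array([
    [[0, 0, 1, 1], [0, 0, 1, 1]],
    [[0, 0, 1, 1], [0, 0, 1, 1]],
    [[0, 0, 0, 0], [1, 1, 0, 0]],
])
K = gene_haplotype_kernel(haps, [[0, 1], [2, 3]])
```

Under this particular measure each per-gene matrix equals ZZ'/4, where Z counts the copies of each haplotype allele carried by each individual, so the averaged matrix is symmetric and positive semi-definite and could in principle stand in for the relationship matrix in a kernel-based model such as GBLUP.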