Zhao Zhida, Niu Qunhao, Wu Jiayuan, Wu Tianyi, Xie Xueyuan, Wang Zezhao, Zhang Lupei, Gao Huijiang, Gao Xue, Xu Lingyang, Zhu Bo, Li Junya
Key Laboratory of Animal Genetics Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.
Northern Agriculture and Livestock Husbandry Technology Innovation Center, Hohhot, 010010, China.
Biol Direct. 2024 Dec 31;19(1):147. doi: 10.1186/s13062-024-00574-y.
Integrating multi-layered information can enhance the accuracy of genomic prediction (GP) for complex traits. However, developing and applying effective strategies for GP with multi-omics data remain challenging.
We generated 11 feature sets for sequencing variants from genomic, transcriptomic, metabolomic, and epigenetic data in beef cattle and assessed the contribution of functional variants using genomic restricted maximum likelihood (GREML). We then estimated and ranked variant scores for 43 economically important traits and compared the prediction accuracy of the top and bottom variant sets using genomic best linear unbiased prediction (GBLUP) and BayesB models. In addition, we annotated variants from genome-wide association studies (GWAS) with the functional feature sets and performed enrichment analysis.
We observed significant enrichment for 32 functional categories across the 11 feature sets. The evolutionarily related sets (conservation regions and selection signatures) contributed significantly to heritability (31.78-fold and 14.48-fold enrichment, respectively), while the metabolomics and transcriptomics sets showed low heritability enrichment. Prediction accuracy increased significantly when using the top feature set variants compared with whole-genome sequencing (WGS) data. Across traits, prediction accuracy based on the top 10% variant set increased by an average of 11.6% with BayesB and 7.54% with GBLUP. Notably, the greatest increase, 31.52%, was obtained for spleen weight (SW) using BayesB. We also found that the top 10% of variants were strongly enriched for weight-related QTLs in the Cattle QTL database.
Our findings suggest that integrating biological prior information from multiple layers can enhance our understanding of the genetic architecture underlying complex traits and further improve genomic prediction in beef cattle.
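The comparison described above (ranking variants by score, keeping a top fraction, and predicting with GBLUP) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a VanRaden-style genomic relationship matrix, a heritability value supplied by the user rather than estimated by GREML, and a simple training mean as the only fixed effect. The function names `vanraden_grm`, `top_fraction`, and `gblup_predict` are hypothetical and chosen for illustration only.

```python
import numpy as np


def vanraden_grm(genotypes):
    """Genomic relationship matrix from an (animals x variants) matrix coded 0/1/2.

    Assumes at least one variant is polymorphic, so the scaling term is non-zero.
    """
    p = genotypes.mean(axis=0) / 2.0      # allele frequency of each variant
    z = genotypes - 2.0 * p               # centre each variant column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def top_fraction(scores, fraction=0.10):
    """Indices of the highest-scoring variants (for example, the top 10%)."""
    k = max(1, int(round(fraction * len(scores))))
    return np.argsort(scores)[::-1][:k]


def gblup_predict(grm, y, train, test, h2):
    """GBLUP predictions for test animals, given an assumed heritability h2.

    Illustrative simplification: the overall mean is taken from the training
    phenotypes instead of being estimated jointly with the variance components.
    """
    lam = (1.0 - h2) / h2                 # residual-to-genetic variance ratio
    mu = y[train].mean()
    g_tt = grm[np.ix_(train, train)]
    alpha = np.linalg.solve(g_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + grm[np.ix_(test, train)] @ alpha
```

To compare a top-ranked subset against all variants under these assumptions, one would build one relationship matrix from `genotypes[:, top_fraction(scores)]` and another from the full genotype matrix, run `gblup_predict` on the same training and validation split, and correlate each set of predictions with the observed phenotypes.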