Optimizing genomic prediction model given causal genes in a dairy cattle population.

Affiliations

Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China.

State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, North Third Road, Guangzhou Higher Education Mega Center, Guangzhou 510006, China.

Publication information

J Dairy Sci. 2020 Nov;103(11):10299-10310. doi: 10.3168/jds.2020-18233. Epub 2020 Sep 18.

Abstract

As genotypic data move from SNP chips toward whole-genome sequence, the accuracy of genomic prediction (GP) has shown only a marginal gain, even though whole-genome sequence data contain all genetic variation, including causal genes. Meanwhile, genetic analyses of complex traits, such as genome-wide association studies, have identified an increasing number of genomic regions, including potential causal genes, which would be reliable prior knowledge for GP. Many studies have tried to improve the performance of GP by modifying the prediction model to incorporate prior knowledge. Although several plausible results have been obtained from model modification or strategy optimization, most of them were validated in a specific empirical population with a limited variety of genetic architectures for complex traits. An alternative approach is to use simulated genetic architectures with known causal genes (e.g., simulated causative SNP) to evaluate different GP models with given causal genes. Our objectives were to (1) evaluate the performance of GP under a variety of genetic architectures with a subset of known causal genes and (2) compare different GP models modified by highlighting causal genes and different strategies to weight causal genes. In this study, we simulated pseudo-phenotypes under a variety of genetic architectures based on the real genotypes and phenotypes of a dairy cattle population. Besides classical genomic best linear unbiased prediction, we evaluated 3 modified GP models that highlight causal genes as follows: (1) by treating them as fixed effects, (2) by treating them as a separate random component, and (3) by combining them into the genomic relationship matrix as random effects. Our results showed that highlighting, in the GP models, known causal genes that explained a considerable proportion of genetic variance increased the predictive accuracy. Combining all given causal genes into the genomic relationship matrix was the optimal strategy under all the scenarios validated, and treating causal genes as a separate random component is also recommended when more than 20% of genetic variance is explained by known causal genes. Moreover, assigning differential weights to each causal gene further improved the predictive accuracy.
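
To make the third strategy more concrete, the following is a minimal sketch of how known causal SNP could be up-weighted inside a genomic relationship matrix and then used in a GBLUP-style prediction. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact formulation, so the weighted VanRaden-type matrix, the function names `weighted_grm` and `gblup_predict`, the fixed heritability `h2`, and the weight values are all hypothetical choices.

```python
import numpy as np


def weighted_grm(genotypes, weights=None):
    """Weighted VanRaden-type genomic relationship matrix (assumed form).

    genotypes: (n_animals, n_snp) array coded 0/1/2.
    weights:   optional non-negative per-SNP weights; causal SNP can be
               given larger values. None means equal weights (plain GRM).
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0          # allele frequency per SNP
    Z = M - 2.0 * p                   # centre each SNP column
    w = np.ones(M.shape[1]) if weights is None else np.asarray(weights, float)
    scale = np.sum(w * 2.0 * p * (1.0 - p))
    return (Z * w) @ Z.T / scale      # Z diag(w) Z' / sum(w * 2pq)


def gblup_predict(G, y, train, test, h2=0.5):
    """Predict phenotypes of `test` animals from `train` animals.

    h2 is an assumed, fixed heritability; in practice the variance
    components would be estimated (e.g. by REML) rather than supplied.
    """
    lam = (1.0 - h2) / h2             # residual-to-genetic variance ratio
    mu = y[train].mean()              # intercept as the only fixed effect
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 200, 500
    M = rng.binomial(2, 0.3, size=(n, m))
    causal = np.arange(5)             # pretend the first 5 SNP are known causal
    y = M[:, causal] @ rng.normal(size=5) + rng.normal(size=n)

    w = np.ones(m)
    w[causal] = 20.0                  # arbitrary illustrative weight
    train, test = np.arange(150), np.arange(150, n)
    for label, G in [("GBLUP", weighted_grm(M)), ("weighted", weighted_grm(M, w))]:
        pred = gblup_predict(G, y, train, test)
        print(label, np.corrcoef(pred, y[test])[0, 1])
```

Because the matrix is rescaled by the weighted sum of 2pq, only the relative sizes of the weights matter; multiplying every weight by the same constant leaves the matrix unchanged.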
