Boundaries for genotype, phenotype, and pedigree truncation in genomic evaluations in pigs.

Affiliations

Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA.

Genus PIC, Hendersonville, TN 37075, USA.

Publication information

J Anim Sci. 2023 Jan 3;101. doi: 10.1093/jas/skad273.

Abstract

Historical data collection for genetic evaluation purposes is a common practice in animal populations; however, the larger the dataset, the higher the computing power needed to perform the analyses. Also, fitting the same model to historical and recent data may be inappropriate. Data truncation can reduce the number of equations to solve, consequently decreasing computing costs; however, the large volume of genotypes is responsible for most of the increase in computations. This study aimed to assess the impact of removing genotypes along with phenotypes and pedigree on the computing performance, reliability, and inflation of genomic predicted breeding values (GEBV) from single-step genomic best linear unbiased predictor for selection candidates. Data from two pig lines, a terminal sire line (L1) and a maternal line (L2), were analyzed in this study. Four analyses were implemented: growth and "weaning to finish" mortality on L1, and pre-weaning and reproductive traits on L2. Four genotype removal scenarios were proposed: removing genotyped animals without phenotypes and progeny (noInfo), removing genotyped animals based on birth year (Age), the combination of the noInfo and Age scenarios (noInfo + Age), and no genotype removal (AllGen). In all scenarios, phenotypes were removed based on birth year, and three pedigree depths were tested: tracing back two generations, tracing back three generations, and using the entire pedigree. The full dataset contained 1,452,257 phenotypes for growth traits, 324,397 for weaning to finish mortality, 517,446 for pre-weaning traits, and 7,853,629 for reproductive traits in pure and crossbred pigs. Pedigree files for lines L1 and L2 comprised 3,601,369 and 11,240,865 animals, of which 168,734 and 170,121 were genotyped, respectively. In each truncation scenario, the linear regression method was used to assess the reliability and dispersion of GEBV for genotyped parents (born after 2019). The number of years of data that could be removed without harming reliability depended on the number of records, the type of analysis (multitrait vs. single trait), the heritability of the trait, and the data structure. All scenarios had similar reliabilities, except for noInfo, which performed better in the growth analysis. Based on the data used in this study, considering the last ten years of phenotypes, tracing three generations back in the pedigree, and removing genotyped animals that contribute neither own nor progeny phenotypes increases computing efficiency with no change in the ability to predict breeding values.
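
The truncation scenarios described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Animal` record, the function names `trace_pedigree` and `select_genotypes`, and the `min_birth_year` cutoff are all hypothetical, introduced only for illustration. The sketch also assumes that any recorded progeny counts as "information" for the noInfo rule, and it does not special-case selection candidates, which a real evaluation would need to retain.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Animal:
    """Hypothetical pedigree record; field names are assumptions."""
    sire: Optional[str]
    dam: Optional[str]
    birth_year: int
    genotyped: bool = False
    has_phenotype: bool = False


def trace_pedigree(pedigree: Dict[str, Animal],
                   start_ids: Iterable[str],
                   generations: Optional[int]) -> Set[str]:
    """Return start_ids plus their ancestors up to `generations` back.

    generations=None keeps every known ancestor (the entire pedigree
    behind the starting animals).
    """
    kept = set(start_ids)
    frontier = set(kept)
    depth = 0
    while frontier and (generations is None or depth < generations):
        parents = set()
        for animal_id in frontier:
            animal = pedigree.get(animal_id)
            if animal is None:
                continue  # parent ID with no record of its own
            for parent in (animal.sire, animal.dam):
                if parent is not None and parent not in kept:
                    parents.add(parent)
        kept |= parents
        frontier = parents
        depth += 1
    return kept


def select_genotypes(pedigree: Dict[str, Animal],
                     scenario: str,
                     min_birth_year: int) -> Set[str]:
    """Return the genotyped animals kept under one removal scenario.

    Scenario names follow the abstract: AllGen, noInfo, Age, noInfo+Age.
    """
    if scenario not in {"AllGen", "noInfo", "Age", "noInfo+Age"}:
        raise ValueError(f"unknown scenario: {scenario}")
    # Assumption: an animal "has progeny" if it appears as a sire or dam.
    parents = {p for a in pedigree.values() for p in (a.sire, a.dam) if p}
    kept = set()
    for animal_id, animal in pedigree.items():
        if not animal.genotyped:
            continue
        has_info = animal.has_phenotype or animal_id in parents
        is_recent = animal.birth_year >= min_birth_year
        if scenario in ("noInfo", "noInfo+Age") and not has_info:
            continue
        if scenario in ("Age", "noInfo+Age") and not is_recent:
            continue
        kept.add(animal_id)
    return kept
```

Calling `trace_pedigree(pedigree, animals_with_records, 3)` would correspond to the three-generation depth, and `select_genotypes(pedigree, "noInfo", min_birth_year)` to the noInfo scenario. Phenotype truncation by birth year is a plain filter on `birth_year` and is omitted here.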

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdfd/10464514/ea42fe254e7f/skad273_fig1.jpg
