Estimating Heritabilities and Breeding Values From Censored Phenotypes Using a Data Augmentation Approach.

Author information

Stephen Melissa A, Cheng Hao, Pryce Jennie E, Burke Chris R, Steele Nicole M, Phyn Claire V C, Garrick Dorian J

Affiliations

DairyNZ Ltd., Hamilton, New Zealand.

AL Rae Centre for Genetics and Breeding-Massey University, Hamilton, New Zealand.

Publication information

Front Genet. 2022 Jul 25;13:867152. doi: 10.3389/fgene.2022.867152. eCollection 2022.

DOI:10.3389/fgene.2022.867152

PMID:35957692

Original link: https://pmc.ncbi.nlm.nih.gov/articles/PMC9358037/

Abstract

Time-dependent traits are often subject to censorship: for some individuals, only a lower and/or upper bound on the phenotype can be established rather than a precise value. Censorship reduces the precision of phenotypes, but it can represent a compromise between measurement cost and animal ethics considerations. This compromise is particularly relevant for genetic evaluation because phenotyping initiatives often involve thousands of individuals. This research aimed to: 1) demonstrate a data augmentation approach for analysing censored phenotypes, and 2) quantify the implications of phenotype censorship for the estimation of heritabilities and the prediction of breeding values. First, we simulated uncensored phenotypes representing fine-scale "age at puberty" for each individual in a population of approximately 5,000 animals across 50 herds. Analysis of these uncensored phenotypes provided a gold-standard control. We then produced seven "test" phenotypes by superimposing varying degrees of left, interval, and/or right censorship, as if herds had been measured on only one, two, or three occasions, with each animal recorded as a binary category (either pre- or post-pubertal) at each visit. Estimates of heritabilities and predictions of breeding values obtained using the data augmentation approach were remarkably robust to phenotype censorship. These results have important practical implications for measuring time-dependent traits for genetic evaluation. More specifically, we suggest that data collection can be designed with relatively infrequent repeated measures, thereby reducing costs and increasing feasibility across large numbers of animals.
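
To make the censoring scheme and the idea behind data augmentation concrete, the following is a minimal sketch, not the authors' implementation. It assumes ages are in days, that an animal is recorded as post-pubertal at a visit if its true age at puberty is at or before the visit age, and that the latent phenotype is normally distributed. The function names `censor` and `augment`, the visit ages, and the mean and standard deviation are all hypothetical. The full analysis in the paper would embed such draws in a model with genetic and herd effects, which is omitted here.

```python
import random
from statistics import NormalDist


def censor(true_age, visit_ages):
    """Return (lower, upper) bounds on age at puberty implied by binary
    pre-/post-pubertal records at each visit. None means unbounded.

    (None, v)  -> left-censored: already post-pubertal at the first visit
    (v1, v2)   -> interval-censored: puberty occurred between two visits
    (v, None)  -> right-censored: still pre-pubertal at the last visit
    """
    lower, upper = None, None
    for visit in sorted(visit_ages):
        if true_age <= visit:   # recorded as post-pubertal at this visit
            upper = visit
            break
        lower = visit           # still pre-pubertal at this visit
    return lower, upper


def augment(lower, upper, mean, sd, rng=random):
    """Draw one value from Normal(mean, sd) truncated to (lower, upper),
    using the inverse-CDF method. This stands in for a single data
    augmentation step under an assumed normal model."""
    dist = NormalDist(mean, sd)
    p_lo = 0.0 if lower is None else dist.cdf(lower)
    p_hi = 1.0 if upper is None else dist.cdf(upper)
    u = rng.uniform(p_lo, p_hi)
    # Keep u strictly inside (0, 1), as inv_cdf requires. In extreme tails
    # this clamp can move the draw marginally outside the bounds.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(u)


# Hypothetical example: two herd visits at 300 and 400 days of age.
visits = [300, 400]
bounds = censor(350, visits)                 # interval-censored record
draw = augment(*bounds, mean=360, sd=30)     # one augmented phenotype
```

In a sampler, the draw would be repeated at every iteration with `mean` replaced by the animal's current fitted value, so that the augmented phenotypes and the model parameters are updated in turn.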
