Laurin Charles, Cuellar-Partida Gabriel, Hemani Gibran, Smith George Davey, Yang Jian, Evans David M
MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK.
Faculty of Medicine, Translational Research Institute, The University of Queensland Diamantina Institute, Brisbane, QLD, Australia.
Behav Genet. 2018 Jan;48(1):67-79. doi: 10.1007/s10519-017-9880-0. Epub 2017 Nov 2.
We propose a new method, G-REMLadp, to estimate the phenotypic variance explained by parent-of-origin effects (POEs) across the genome. Our method uses restricted maximum likelihood analysis of genome-wide genetic relatedness matrices based on individuals' phased genotypes. Genome-wide SNP data from parent-child duos or trios are required to obtain relatedness matrices indexing the parental origin of offspring alleles, and offspring phenotype data are required to partition the trait variation into variance components. To calibrate the power of G-REMLadp to detect non-null POEs when they are present, we provide an analytic approximation derived from Haseman-Elston regression. We also used simulated data to quantify the power and Type I error rates of G-REMLadp, as well as the sensitivity of its variance component estimates to violations of underlying assumptions. We subsequently applied G-REMLadp to 36 phenotypes in a sample of individuals from the Avon Longitudinal Study of Parents and Children (ALSPAC). We found that the method does not seem to be inherently biased in estimating variance due to POEs, and that substantial correlation between parental genotypes is necessary to generate biased estimates. Our empirical results, power calculations and simulations indicate that sample sizes over 10,000 unrelated parent-offspring duos will be necessary to detect POEs explaining < 10% of the variance with moderate power. We conclude that POEs tagged by our genetic relationship matrices are unlikely to explain large proportions of the phenotypic variance (i.e. > 15%) for the 36 traits that we have examined.
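The general idea of building relatedness matrices from phased genotypes and partitioning trait variance can be illustrated with a minimal sketch. This is not the authors' G-REMLadp implementation. It uses Haseman-Elston regression, which the abstract mentions only as the basis of the power approximation, in place of restricted maximum likelihood. It also assumes, purely for illustration, that the additive component is coded as the sum of the maternally and paternally transmitted alleles and the parent-of-origin component as their difference. The function names `standardize`, `relatedness_matrices` and `haseman_elston` are hypothetical.

```python
import numpy as np


def standardize(x):
    """Centre each SNP column and scale it to unit variance."""
    x = x - x.mean(axis=0)
    sd = x.std(axis=0)
    sd[sd == 0] = 1.0  # leave monomorphic SNPs at zero instead of dividing by zero
    return x / sd


def relatedness_matrices(maternal, paternal):
    """Build additive and parent-of-origin relatedness matrices.

    maternal, paternal: (n_individuals, n_snps) arrays of 0/1 allele
    counts for the haplotype each offspring inherited from each parent.
    Assumed coding (illustrative only): additive = maternal + paternal,
    parent-of-origin = maternal - paternal.
    """
    n_snps = maternal.shape[1]
    additive = standardize(maternal + paternal)
    poe = standardize(maternal - paternal)
    return additive @ additive.T / n_snps, poe @ poe.T / n_snps


def haseman_elston(y, grms):
    """Estimate variance components by Haseman-Elston regression.

    Regresses the cross-products of the standardized phenotype for all
    pairs of distinct individuals on the matching off-diagonal entries
    of each relatedness matrix. Returns one coefficient per matrix.
    """
    y = (y - y.mean()) / y.std()
    upper = np.triu_indices(len(y), k=1)
    design = np.column_stack([g[upper] for g in grms])
    response = np.outer(y, y)[upper]
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef
```

Given phased offspring haplotypes and a phenotype vector `y`, calling `haseman_elston(y, relatedness_matrices(maternal, paternal))` would return two coefficients, interpretable under these assumptions as rough estimates of the proportions of variance tagged by the additive and parent-of-origin matrices. With unrelated individuals the off-diagonal entries are small, so such estimates are noisy unless the sample is large, which is consistent with the sample sizes the abstract reports as necessary.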