Oualkacha Karim, Labbe Aurelie, Ciampi Antonio, Roy Marc-Andre, Maziade Michel
McGill University.
Stat Appl Genet Mol Biol. 2012 Jan 6;11(2). doi: 10.2202/1544-6115.1711.
For many complex disorders, a genetically relevant definition of the disease is still unclear. For this reason, researchers tend to collect large numbers of items related directly or indirectly to the disease diagnosis. Since the measured traits may not all be influenced by genetic factors, researchers face the problem of choosing which traits or combinations of traits to consider in linkage analysis. To combine items, one can subject the data to a principal component analysis. However, when family data are collected, principal component analysis does not take family structure into account. To deal with these issues, Ott & Rabinowitz (1999) introduced the principal components of heritability (PCH), which capture the familial information across traits by calculating linear combinations of traits that maximize heritability. The calculation of the PCHs is based on the estimation of the genetic and the environmental components of variance. In the genetic context, the standard estimators of the variance components are Lange's maximum likelihood estimators, which require complex numerical calculations. The objectives of this paper are the following: i) to review some standard strategies available in the literature to estimate variance components for unbalanced data in mixed models; ii) to propose an ANOVA method for a genetic random effect model to estimate the variance components, which can be applied to general pedigrees and high-dimensional family data within the PCH framework; iii) to elucidate the connection between PCH analysis and Linear Discriminant Analysis. We use computer simulations to show that the proposed method has asymptotic properties similar to those of Lange's method when the number of traits is small, and we study the efficiency of our method when the number of traits is large. Finally, a data analysis involving schizophrenia and bipolar quantitative traits is presented to illustrate the PCH methodology.
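As an illustration of the PCH idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that estimates of the genetic and environmental covariance matrices are already available (from Lange's maximum likelihood estimators, an ANOVA-type estimator, or any other source) and that heritability of a linear combination w is taken to be w'Σg w / w'(Σg + Σe)w. Under that assumption, the maximizing combinations are the solutions of a generalized symmetric eigenproblem. The function name and the numerical matrices are hypothetical and chosen only for demonstration.

```python
import numpy as np
from scipy.linalg import eigh


def principal_components_of_heritability(sigma_g, sigma_e):
    """Sketch: loadings that maximize w' Sg w / w' (Sg + Se) w.

    sigma_g: estimated genetic covariance matrix (symmetric, PSD).
    sigma_e: estimated environmental covariance matrix (symmetric).
    Returns heritabilities in descending order and the matching
    loading vectors as columns.
    """
    sigma_g = np.asarray(sigma_g, dtype=float)
    sigma_e = np.asarray(sigma_e, dtype=float)
    total = sigma_g + sigma_e  # must be positive definite

    # Generalized symmetric eigenproblem: sigma_g w = h2 * total w.
    # eigh returns eigenvalues in ascending order.
    h2, loadings = eigh(sigma_g, total)

    order = np.argsort(h2)[::-1]  # largest heritability first
    return h2[order], loadings[:, order]


# Hypothetical variance component estimates for three traits.
sigma_g = np.array([[0.60, 0.20, 0.00],
                    [0.20, 0.30, 0.00],
                    [0.00, 0.00, 0.05]])
sigma_e = np.array([[0.40, 0.00, 0.00],
                    [0.00, 0.70, 0.10],
                    [0.00, 0.10, 0.95]])

h2, loadings = principal_components_of_heritability(sigma_g, sigma_e)
first_pch = loadings[:, 0]  # weights of the most heritable combination
```

The first column of `loadings` gives the trait weights of the first PCH, and `h2[0]` is its heritability under the assumed definition. In high-dimensional settings the total covariance estimate may be ill-conditioned, in which case some form of regularization would be needed before solving the eigenproblem; that step is outside this sketch.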