Carey Gregory
Department of Psychology and Institute for Behavioral Genetics, University of Colorado, Boulder, CO 80309-0345, USA.
Behav Genet. 2005 Sep;35(5):653-65. doi: 10.1007/s10519-005-5355-9.
Behavioral geneticists commonly parameterize a genetic or environmental covariance matrix as the product of a lower triangular matrix postmultiplied by its transpose, a technique commonly referred to as "fitting a Cholesky." Here, simulations demonstrate that this procedure is sometimes valid, but at other times: (1) it may not produce fit statistics that are distributed as χ²; or (2) if the distribution of the fit statistic is χ², the degrees of freedom (df) are not always the number of parameters in the general model minus the number of parameters in the constrained model. It is hypothesized that the problem is related to the fact that the Cholesky parameterization requires the covariance matrix formed by the product to be either positive definite or singular (that is, positive semidefinite). Even though a population covariance matrix may be positive definite, the combination of sampling error and the derived, as opposed to directly observed, nature of genetic and environmental matrices allows estimated matrices that are negative (semi)definite. When this occurs, fitting a Cholesky constrains the numerical area of search and compromises the maximum likelihood theory currently used in behavioral genetics. Until the reasons for this phenomenon are understood and satisfactory solutions are developed, those who fit Cholesky matrices face the burden of demonstrating the validity of their fit statistics and of the df for model comparisons. An interim remedy is proposed: fit an unconstrained model and a Cholesky model, and if the two differ, report the difference in fit statistics and parameter estimates. Cholesky problems are a matter of degree, not of kind. Thus, some Cholesky solutions will differ trivially from the unconstrained solutions, and the importance of the problems must be assessed by how often the two lead to different substantive interpretations of the results.
If followed, the proposed interim remedy will develop a body of empirical data to assess the extent to which Cholesky problems are important substantive issues versus statistical curiosities.
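The constraint described in the abstract can be illustrated with a minimal numerical sketch. It is not the simulation reported in the paper. It assumes a two-trait twin design in which the additive genetic matrix is derived as twice the difference between the MZ and DZ cross-twin covariance matrices. The matrices `cov_mz` and `cov_dz` are hypothetical values chosen for illustration, and the helper names are invented here. The derived matrix has a negative eigenvalue, whereas any product of a lower triangular matrix with its transpose does not.

```python
import numpy as np

def cholesky_product(params, k):
    """Build L @ L.T from k*(k+1)/2 free parameters placed in the lower triangle."""
    L = np.zeros((k, k))
    L[np.tril_indices(k)] = params
    return L @ L.T

def min_eigenvalue(S):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(S).min())

# Hypothetical cross-twin covariance matrices (illustrative values, not data).
cov_mz = np.array([[0.50, 0.30],
                   [0.30, 0.40]])
cov_dz = np.array([[0.30, 0.05],
                   [0.05, 0.35]])

# Derived (not directly observed) additive genetic matrix, assuming A = 2 * (MZ - DZ).
A_unconstrained = 2.0 * (cov_mz - cov_dz)
print("min eigenvalue, unconstrained A:", min_eigenvalue(A_unconstrained))

# Any Cholesky-parameterized matrix is positive semidefinite, whatever the parameters,
# so an optimizer searching over L can never reach a matrix like A_unconstrained.
rng = np.random.default_rng(0)
cholesky_min_eigs = [
    min_eigenvalue(cholesky_product(rng.normal(size=3), 2))
    for _ in range(1000)
]
print("smallest min eigenvalue over random Cholesky products:", min(cholesky_min_eigs))
```

In this sketch the unconstrained estimate lies outside the region that the Cholesky parameterization can reach. This is the situation in which the abstract suggests the two fits should be compared and any differences reported.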