Barber Mathew J, Cordell Heather J, MacGregor Alex J, Andrew Toby
Department of Medical Genetics, University of Cambridge, Cambridge, UK.
Genet Epidemiol. 2004 Feb;26(2):97-107. doi: 10.1002/gepi.10299.
Existing standard methods of linkage analysis for quantitative phenotypes rest on the assumptions of either ordinary least squares (Haseman and Elston [1972] Behav. Genet. 2:3-19; Sham and Purcell [2001] Am. J. Hum. Genet. 68:1527-1532) or phenotypic normality (Almasy and Blangero [1998] Am. J. Hum. Genet. 62:1198-1211; Kruglyak and Lander [1995] Am. J. Hum. Genet. 57:439-454). The limitations of both methods lie in how the error distribution is specified in the respective regression analyses. In ordinary least squares regression, the residual distribution is misspecified as being independent of the mean level. Using variance components and assuming phenotypic normality, the dependency on the mean level is correctly specified, but the remaining residual coefficient of variation is constrained a priori. Here it is shown that these limitations can be addressed (for a sample of unselected sib-pairs) using a generalized linear model based on the gamma distribution, which can be readily implemented in any standard statistical software package. The generalized linear model approach can emulate variance components when phenotypic multivariate normality is assumed (Almasy and Blangero [1998] Am. J. Hum. Genet. 62:1198-1211) and is therefore more powerful than ordinary least squares, but it has the added advantages of being robust to deviations from multivariate normality and of providing (often overlooked) model-fit diagnostics for linkage analysis.
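As an illustration only, the contrast between ordinary least squares and a gamma-based generalized linear model can be sketched for the Haseman-Elston setting, in which the squared trait difference of each sib-pair is regressed on the proportion of alleles shared identical by descent (IBD). The abstract does not give the exact model specification, so everything below is an assumption: the identity link, the simulated intercept and slope, and the helper names (`simulate_squared_differences`, `weighted_line_fit`, `gamma_identity_fit`, `pearson_dispersion`) are all hypothetical. The sketch relies on one standard fact: if the sib-pair difference is normal with mean zero, its square is a scaled chi-square variable with one degree of freedom, so its variance is proportional to the square of its mean and its squared coefficient of variation equals 2.

```python
import random


def simulate_squared_differences(n_pairs, intercept=4.0, slope=-2.0, seed=1):
    """Simulate IBD proportions and squared sib-pair differences.

    Hypothetical model: E[squared difference | pi] = intercept + slope * pi,
    with normally distributed differences (illustrative values only).
    """
    rng = random.Random(seed)
    ibd, sq_diff = [], []
    for _ in range(n_pairs):
        # IBD proportion 0, 1/2, 1 with probabilities 1/4, 1/2, 1/4
        pi = rng.choice((0.0, 0.5, 0.5, 1.0))
        mean = intercept + slope * pi
        d = rng.gauss(0.0, mean ** 0.5)  # difference with variance `mean`
        ibd.append(pi)
        sq_diff.append(d * d)
    return ibd, sq_diff


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def gamma_identity_fit(x, y, n_iter=25):
    """Gamma GLM with identity link, fitted by iteratively reweighted
    least squares. The variance function is mu^2, so the weights are 1/mu^2."""
    a, b = weighted_line_fit(x, y, [1.0] * len(x))  # start from the OLS fit
    for _ in range(n_iter):
        mu = [max(a + b * xi, 1e-8) for xi in x]  # keep fitted means positive
        a, b = weighted_line_fit(x, y, [1.0 / (m * m) for m in mu])
    return a, b


def pearson_dispersion(x, y, a, b):
    """Pearson estimate of the gamma dispersion (squared coefficient of variation)."""
    resid = [((yi - (a + b * xi)) / (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    return sum(resid) / (len(x) - 2)


if __name__ == "__main__":
    ibd, sq = simulate_squared_differences(5000)
    print("OLS   (intercept, slope):", weighted_line_fit(ibd, sq, [1.0] * len(ibd)))
    a, b = gamma_identity_fit(ibd, sq)
    print("gamma (intercept, slope):", (a, b))
    print("estimated dispersion:", pearson_dispersion(ibd, sq, a, b))
```

In this sketch a negative slope indicates linkage. The estimated dispersion acts as a simple model-fit check: under normally distributed differences it should be close to 2, and a clear departure from 2 would suggest that the normality assumption behind variance components does not hold for the data at hand.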