Doan Betty Q, Sorant Alexa J M, Frangakis Constantine E, Bailey-Wilson Joan E, Shugart Yin Y
Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
Eur J Hum Genet. 2006 Sep;14(9):1018-26. doi: 10.1038/sj.ejhg.5201650. Epub 2006 May 31.
Successful identification of genetic risk loci for complex diseases has relied on the ability to minimize disease and genetic heterogeneity to increase the power to detect linkage. One way to account for disease heterogeneity is to incorporate covariate data. However, each covariate included adds one degree of freedom to the allele-sharing-based linkage test, which may in fact decrease power. We explore the application of a propensity score, which is typically used in causal inference to combine multiple covariates into a single variable, as a means of allowing for multiple covariates with the addition of only one degree of freedom. In this study, binary trait data, simulated under various models involving genetic and environmental effects, were analyzed using a nonparametric linkage statistic implemented in LODPAL. Power and type I error rates were evaluated. Results suggest that using the propensity score to combine multiple covariates into a single covariate consistently improves power compared with an analysis including no covariates, each covariate individually, or all covariates simultaneously. Type I error rates were inflated for analyses with covariates and increased with the number of covariates, but fell to nominal rates with sample sizes of 1000 families. Therefore, we recommend using the propensity score as a single covariate in the linkage analysis of a trait suspected to be influenced by multiple covariates, because of its potential to increase the power to detect linkage while controlling the increase in type I error.
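As a rough illustration of how several covariates can be collapsed into a single propensity score, the sketch below fits a logistic regression of a binary trait on three covariates and takes the fitted probability as the score. This is only an assumption about the form of the score: the abstract does not state which model was used, and the simulated data, effect sizes, and function names (`simulate`, `fit_logistic`, `propensity_scores`) are all hypothetical. How the score is then supplied to a linkage program such as LODPAL is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, seed=0):
    """Hypothetical data: three covariates per individual and a binary trait.

    The effect sizes are arbitrary and are not taken from the study.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]
        p = sigmoid(-0.5 + 1.0 * x[0] + 0.5 * x[1])  # third covariate has no effect
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Logistic regression by batch gradient ascent.

    Returns [intercept, b1, ..., bp].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            resid = yi - sigmoid(z)
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Fitted probability of being affected, one value per individual."""
    return [
        sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
        for xi in X
    ]


X, y = simulate(500, seed=1)
beta = fit_logistic(X, y)
scores = propensity_scores(X, beta)
print("coefficients:", [round(b, 2) for b in beta])
print("first five scores:", [round(s, 3) for s in scores[:5]])
```

Each individual ends up with one number in place of three covariates, so a downstream test that adds one degree of freedom per covariate would add only one.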