Division of Biostatistics, University of Minnesota, Minneapolis, MN, USA.
Department of Psychology, University of Minnesota, Minneapolis, MN, USA.
Behav Genet. 2020 Nov;50(6):423-439. doi: 10.1007/s10519-020-10010-2. Epub 2020 Aug 17.
Genome-wide association studies (GWASs) are a popular tool for detecting associations between genetic variants, such as single nucleotide polymorphisms (SNPs), and complex traits. Family data introduce complexity because family members are not independent. Methods for non-independent data are well established, but when a GWAS contains distinct family types, explicitly modeling between-family-type differences in the dependence structure substantially increases the computational burden. The problem is exacerbated with binary traits. In this paper, we perform several simulation studies to compare candidate methods for single-SNP association analysis with binary traits: generalized estimating equations (GEE), generalized linear mixed models (GLMMs), and generalized least squares (GLS). We study how the choice of working correlation structure for GEE influences GWAS findings, and we compare the performance of the analysis methods for conducting a GWAS with binary trait data in families. We discuss the merits of each approach with attention to its applicability in a GWAS. We also compare the performance of the methods on alcoholism data from the Minnesota Center for Twin and Family Research (MCTFR) study.
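As a rough illustration of one of the approaches named above, the following sketch fits a logistic GEE with an independence working correlation to a single simulated SNP and reports a Wald test based on the cluster-robust (sandwich) variance. It is not the authors' code or simulation design. The function names, the shared family random intercept, the parameter values, and the simplification that genotypes are drawn independently within families are all assumptions made only for this example.

```python
import math

import numpy as np


def simulate_families(n_families=400, family_size=4, maf=0.3,
                      beta_snp=0.5, sigma_family=1.0, seed=0):
    """Simulate a binary trait in families (hypothetical setup, not the paper's)."""
    rng = np.random.default_rng(seed)
    family_id = np.repeat(np.arange(n_families), family_size)
    n = n_families * family_size
    # Simplifying assumption: genotypes are independent draws, whereas real
    # relatives share alleles.
    genotype = rng.binomial(2, maf, size=n).astype(float)
    # A shared random intercept induces within-family dependence in the trait.
    family_effect = rng.normal(0.0, sigma_family, size=n_families)[family_id]
    logit = -1.0 + beta_snp * genotype + family_effect
    prob = 1.0 / (1.0 + np.exp(-logit))
    trait = rng.binomial(1, prob).astype(float)
    return trait, genotype, family_id


def gee_independence_logit(y, X, groups, n_iter=50, tol=1e-10):
    """Logistic GEE with independence working correlation and sandwich SEs."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        weights = mu * (1.0 - mu)
        score = X.T @ (y - mu)
        bread = X.T @ (X * weights[:, None])
        step = np.linalg.solve(bread, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break

    # Recompute the pieces at the final estimate.
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    weights = mu * (1.0 - mu)
    bread = X.T @ (X * weights[:, None])

    # "Meat" of the sandwich: outer products of per-family score sums.
    _, cluster_index = np.unique(groups, return_inverse=True)
    n_clusters = int(cluster_index.max()) + 1
    cluster_scores = np.zeros((n_clusters, X.shape[1]))
    np.add.at(cluster_scores, cluster_index, X * (y - mu)[:, None])
    meat = cluster_scores.T @ cluster_scores

    bread_inv = np.linalg.inv(bread)
    robust_cov = bread_inv @ meat @ bread_inv
    return beta, np.sqrt(np.diag(robust_cov))


def wald_p_value(estimate, std_err):
    """Two-sided p-value from a standard normal Wald statistic."""
    z = estimate / std_err
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    trait, genotype, family_id = simulate_families()
    design = np.column_stack([np.ones_like(genotype), genotype])
    beta, se = gee_independence_logit(trait, design, family_id)
    print("SNP log-odds estimate:", beta[1])
    print("robust SE:", se[1])
    print("Wald p-value:", wald_p_value(beta[1], se[1]))
```

In a GWAS the same fit would be repeated once per SNP. Other working correlation structures (for example, exchangeable) change the estimating equations and require an extra step to estimate the correlation parameters, which this sketch leaves out.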