Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA.
Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA.
Genet Epidemiol. 2022 Jul;46(5-6):266-284. doi: 10.1002/gepi.22453. Epub 2022 Apr 22.
Genetic association studies of child health outcomes often employ family-based study designs. One of the most popular family-based designs is the case-parent trio design, which considers the smallest possible nuclear family: two parents and their affected child. This trio design is particularly advantageous for studying relatively rare disorders because it is less prone to type I error inflation due to population stratification than population-based study designs (e.g., case-control studies). However, obtaining genetic data from both parents is often difficult in practice, and many large studies predominantly measure genetic variants in mother-child dyads. While some statistical methods for analyzing parent-child dyad data (most commonly involving mother-child pairs) exist, it is not clear whether they provide the same protection against population stratification as trio methods, or whether a specific dyad design (e.g., case-mother dyads vs. case-mother/control-mother dyads) is more advantageous. In this article, we review existing statistical methods for analyzing genome-wide marker data on dyads and perform extensive simulation experiments to benchmark their type I error and statistical power under different scenarios. We extend our evaluation to existing methods for jointly analyzing case-parent trios and dyads. We apply these methods to genotyped and imputed data from multiethnic mother-child pairs only, case-parent trios only, or combinations of both dyads and trios from the Gene, Environment Association Studies consortium (GENEVA), where each family was ascertained through a child affected by nonsyndromic cleft lip with or without cleft palate. Results from the GENEVA study corroborate the findings from our simulation experiments. Finally, we provide recommendations for using statistical genetic association methods for dyads.