Department of Ecology & Evolution, University of Chicago, Chicago, IL 60637.
Department of Biological Sciences, Columbia University, New York, NY 10027.
Proc Natl Acad Sci U S A. 2024 Sep 17;121(38):e2401379121. doi: 10.1073/pnas.2401379121. Epub 2024 Sep 13.
Family-based genome-wide association studies (GWASs) are often claimed to provide an unbiased estimate of the average causal effects (or average treatment effects; ATEs) of alleles, on the basis of an analogy between the random transmission of alleles from parents to children and a randomized controlled trial. We show that this claim does not hold in general. Because Mendelian segregation only randomizes alleles among children of heterozygotes, the effects of alleles in the children of homozygotes are not observable. This feature will matter if an allele has different average effects in the children of homozygotes and heterozygotes, as can arise in the presence of gene-by-environment interactions, gene-by-gene interactions, or differences in linkage disequilibrium patterns. At a single locus, family-based GWAS can be thought of as providing an unbiased estimate of the average effect in the children of heterozygotes (i.e., a local average treatment effect; LATE). This interpretation does not extend to polygenic scores (PGSs), however, because different sets of SNPs are heterozygous in each family. Therefore, other than under specific conditions, the within-family regression slope of a PGS cannot be assumed to provide an unbiased estimate of the LATE for any subset or weighted average of families. In practice, the potential biases of a family-based GWAS are likely smaller than those that can arise from confounding in a standard, population-based GWAS, and so family studies remain important for the dissection of genetic contributions to phenotypic variation. Nonetheless, their causal interpretation is less straightforward than has been widely appreciated.
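The single-locus argument can be illustrated with a small simulation. The sketch below is not taken from the paper: the function name `simulate_family_gwas`, the effect sizes `beta_het` and `beta_hom`, the allele frequency, and the rule that an allele's effect depends only on whether a child has at least one heterozygous parent are all hypothetical choices made for illustration. The estimator used here, a regression of the child's phenotype on the deviation of the child's genotype from the mid-parent expectation, is one common way to write a trio-based association test and is assumed rather than prescribed by the source.

```python
import numpy as np


def simulate_family_gwas(n_families=200_000, p=0.3,
                         beta_het=1.0, beta_hom=0.2, seed=0):
    """Toy single-locus trio simulation (all parameter values are assumptions).

    Returns the within-family slope, the average effect over all children,
    and the average effect among children with a heterozygous parent.
    """
    rng = np.random.default_rng(seed)

    # Parental alleles: (family, parent, allele copy), each 0 or 1.
    parents = rng.binomial(1, p, size=(n_families, 2, 2))

    # Mendelian segregation: each parent transmits one copy at random.
    pick = rng.integers(0, 2, size=(n_families, 2))
    transmitted = np.take_along_axis(parents, pick[..., None], axis=2)[..., 0]
    child = transmitted.sum(axis=1)

    # Expected child genotype given the parents (mid-parent value).
    midparent = parents.sum(axis=(1, 2)) / 2.0

    # Number of heterozygous parents in each family (0, 1, or 2).
    n_het_parents = (parents.sum(axis=2) == 1).sum(axis=1)

    # Assumed heterogeneity: the allele's effect differs between children
    # of heterozygotes and children of two homozygotes.
    beta = np.where(n_het_parents > 0, beta_het, beta_hom)
    y = beta * child + rng.normal(0.0, 1.0, size=n_families)

    # Within-family signal: zero whenever both parents are homozygous,
    # so those families contribute nothing to the slope.
    deviation = child - midparent
    slope = np.cov(deviation, y)[0, 1] / np.var(deviation, ddof=1)

    ate = beta.mean()                       # average effect, all children
    late = beta[n_het_parents > 0].mean()   # children of heterozygotes only
    return slope, ate, late


if __name__ == "__main__":
    slope, ate, late = simulate_family_gwas()
    print(f"within-family slope: {slope:.3f}")
    print(f"average effect, all children (ATE): {ate:.3f}")
    print(f"average effect, children of heterozygotes (LATE): {late:.3f}")
```

Under these assumptions the within-family slope tracks the effect in the children of heterozygotes rather than the average over all children, because the genotype deviation is exactly zero in families where both parents are homozygous. Setting `beta_hom` equal to `beta_het` removes the gap between the two quantities.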