Single Marker Family-Based Association Analysis Not Conditional on Parental Information.

Author information

Namkung Junghyun, Won Sungho

Affiliations

Molecular Diagnostics Team, IVD Business Unit, SK Telecom, SK T-tower 65 Eulji-ro, Jung-gu, 04539, Seoul, South Korea.

Department of Public Health Science, Graduate School of Public Health, Seoul National University, Seoul, South Korea.

Publication information

Methods Mol Biol. 2017;1666:409-439. doi: 10.1007/978-1-4939-7274-6_20.

DOI:10.1007/978-1-4939-7274-6_20

PMID:28980257

Abstract

Family-based association analysis that is unconditional on parental genotypes models the effects of the observed genotypes directly. This approach has been shown to have greater power than conditional methods. In this chapter, we review popular association analysis methods that account for familial correlations: the marginal model using generalized estimating equations (GEE), the mixed model with a polygenic random component, and genome-wide association analyses. The marginal approach does not explicitly model familial correlations but uses the correlation information to improve the efficiency of parameter estimates. This model, fitted with GEE, is useful when the correlation structure is not of interest; the correlations are treated as nuisance parameters. In the mixed model, familial correlations are modeled as random effects; for example, the polygenic inheritance model accounts for correlations that arise from genomic components shared within a family. These unconditional methods provide a flexible modeling framework for general pedigree data, accommodating traits with various distributions and many types of covariate effects. Genome-wide association studies usually test more than 10,000 SNPs, so traditional statistical methods that account for familial correlations often carry a heavy computational burden. We review several recently proposed approaches that avoid this computational issue. The single-marker analysis procedures are demonstrated using the R package gee and the ASSOC program in the S.A.G.E. package, including how to prepare input data, conduct the analysis, and interpret the output. ASSOC allows models to include random components for additional familial correlations that may not be sufficiently explained by a polygenic effect, and it addresses nonnormality of response variables through transformation methods. With its ease of use, ASSOC provides a useful tool for association analysis of large pedigree data.
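As a rough illustration of the marginal approach described above, the sketch below fits a single-marker model with an independence working correlation and a family-clustered robust (sandwich) standard error. It is not the chapter's code, which uses the R package gee and ASSOC. The function name `gee_independence_fit`, the additive 0/1/2 genotype coding, the identity link, the quantitative trait, and the toy data are all assumptions made for this example.

```python
import math
from collections import defaultdict


def gee_independence_fit(family_ids, genotypes, traits):
    """Fit trait = b0 + b1 * genotype; return (b0, b1, robust SE of b1).

    Hypothetical helper: identity link, independence working correlation,
    families treated as clusters for the sandwich variance.
    """
    n = len(traits)
    if not (len(family_ids) == len(genotypes) == n):
        raise ValueError("inputs must have equal length")

    # "Bread" A = X'X for the design columns (1, genotype).
    sg = float(sum(genotypes))
    sgg = float(sum(g * g for g in genotypes))
    sy = float(sum(traits))
    sgy = float(sum(g * y for g, y in zip(genotypes, traits)))
    det = n * sgg - sg * sg
    if det == 0:
        raise ValueError("genotype has no variation; slope is not estimable")

    # Rows of A^{-1}, using the closed form for a 2x2 matrix.
    inv0 = (sgg / det, -sg / det)
    inv1 = (-sg / det, n / det)

    # With an independence working correlation and identity link,
    # the point estimates coincide with ordinary least squares.
    b0 = inv0[0] * sy + inv0[1] * sgy
    b1 = inv1[0] * sy + inv1[1] * sgy

    # "Meat": sum the score contributions within each family, so that
    # within-family correlation enters the variance but not the estimate.
    scores = defaultdict(lambda: [0.0, 0.0])
    for fam, g, y in zip(family_ids, genotypes, traits):
        resid = y - b0 - b1 * g
        scores[fam][0] += resid
        scores[fam][1] += g * resid

    # Slope entry of A^{-1} * Meat * A^{-1}, written as a sum of squares.
    var_b1 = sum((inv1[0] * u0 + inv1[1] * u1) ** 2 for u0, u1 in scores.values())
    return b0, b1, math.sqrt(var_b1)


if __name__ == "__main__":
    # Toy data (made up): four small families, one SNP, one quantitative trait.
    families = ["F1"] * 3 + ["F2"] * 3 + ["F3"] * 2 + ["F4"] * 2
    snp = [0, 1, 1, 2, 1, 0, 0, 2, 1, 2]
    trait = [1.1, 2.9, 3.2, 5.3, 2.7, 0.8, 1.2, 4.6, 3.1, 5.0]
    b0, b1, se = gee_independence_fit(families, snp, trait)
    print(f"slope={b1:.3f}  robust SE={se:.3f}  Wald z={b1 / se:.2f}")
```

The familial correlation affects only the standard error here, which matches the abstract's description of correlations as nuisance parameters. Other working correlation structures, such as exchangeable, would also change the weighting of the estimating equations and are not covered by this sketch.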
