Redden David T, Divers Jasmin, Vaughan Laura Kelly, Tiwari Hemant K, Beasley T Mark, Fernández José R, Kimberly Robert P, Feng Rui, Padilla Miguel A, Liu Nianjun, Miller Michael B, Allison David B
Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America.
PLoS Genet. 2006 Aug 25;2(8):e137. doi: 10.1371/journal.pgen.0020137. Epub 2006 Jul 18.
Individual genetic admixture estimates, determined both across the genome and at specific genomic regions, have been proposed for use in identifying specific genomic regions harboring loci influencing phenotypes in regional admixture mapping (RAM). Estimates of individual ancestry can be used in structured association tests (SAT) to reduce confounding induced by various forms of population substructure. Although RAM and SAT have been presented as two distinct approaches, we provide a conceptual framework in which both are special cases of a more general linear model. We clarify which variables are sufficient to condition upon in order to prevent spurious associations and also provide a simple closed-form "semiparametric" method of evaluating the reliability of individual admixture estimates. An estimate of the reliability of individual admixture estimates is required to make an inherent errors-in-variables problem tractable. Casting RAM and SAT methods as a general linear model offers enormous flexibility, enabling application to a rich set of phenotypes, populations, covariates, and situations, including interaction terms and multilocus models. This approach should allow far wider use of RAM and SAT, often using standard software, in addressing admixture as either a confounder of association studies or a tool for finding loci influencing complex phenotypes in species as diverse as plants, humans, and nonhuman animals.
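The confounding that a structured association test addresses can be illustrated with a small simulation. The sketch below is not the authors' model: the abstract does not state the covariate set, so the single ancestry covariate, the allele frequencies (0.1 and 0.8), the effect size, and all function and variable names are assumptions chosen for illustration. It also uses the true ancestry proportion, so it sidesteps the errors-in-variables problem that arises when ancestry is only estimated.

```python
import numpy as np


def ols(design, response):
    """Ordinary least squares coefficients for response ~ design."""
    coefficients, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coefficients


def simulate(n=5000, seed=0):
    """Simulate an admixed sample in which the marker has NO causal effect.

    All numeric values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    # Individual admixture proportion from one parental population (assumed known).
    ancestry = rng.beta(2.0, 2.0, size=n)
    # Allele frequency is 0.1 in one parental population and 0.8 in the other,
    # so each allele is carried with probability 0.1 + 0.7 * ancestry.
    allele_prob = 0.1 + 0.7 * ancestry
    genotype = rng.binomial(2, allele_prob)
    # The phenotype depends on ancestry only; the marker is not causal.
    phenotype = 2.0 * ancestry + rng.normal(0.0, 1.0, size=n)
    return ancestry, genotype, phenotype


ancestry, genotype, phenotype = simulate()
intercept = np.ones_like(phenotype)

# Naive model: phenotype ~ genotype (ancestry ignored).
naive_effect = ols(np.column_stack([intercept, genotype]), phenotype)[1]

# Adjusted model: phenotype ~ genotype + ancestry.
adjusted_effect = ols(np.column_stack([intercept, genotype, ancestry]), phenotype)[1]

print(f"naive genotype effect:    {naive_effect:.3f}")
print(f"adjusted genotype effect: {adjusted_effect:.3f}")
```

In this setup the naive genotype coefficient is clearly nonzero even though the marker is not causal, while the coefficient from the ancestry-adjusted model is close to zero. Replacing `ancestry` with a noisy estimate would leave residual confounding, which is one way to see why the reliability of the admixture estimates matters.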