Schaid D J
Department of Health Sciences Research, Mayo Clinic/Foundation, Rochester, Minnesota 55905, USA.
Genet Epidemiol. 1996;13(5):423-49. doi: 10.1002/(SICI)1098-2272(1996)13:5<423::AID-GEPI1>3.0.CO;2-3.
Association studies of genetic markers with disease play a critical role in the dissection of genetically complex traits because they are relatively easy to conduct and are useful for fine-scale mapping of genetic traits. The advantage of family-based controls has recently received much attention because spurious associations caused by population structure can be controlled for, and marker genotype information on diseased cases and their parents can be used to test the compound hypothesis of both linkage and linkage disequilibrium. However, debate still exists regarding the statistical methods of analysis. Herein are presented statistical methods to test for linkage (in the presence of linkage disequilibrium) between multiallelic genetic markers and disease when diseased subjects (cases) and their parents are sampled. Theoretical considerations for the development of general statistical tests are presented as well as asymptotic formulas to compute their power when planning a study. Furthermore, simulation results for nine specific statistics are used to contrast the power of these methods under different genetic mechanisms leading to disease (dominant vs. recessive, one vs. two high-risk alleles). These results demonstrate substantial gains in power for specific statistical tests designed to detect specified genetic mechanisms. However, without a priori knowledge of the likely genetic mechanism, it is desirable to rely on a fairly robust statistical method, one whose power is not drastically lost when either dominant or recessive mechanisms are acting, and when either one or more than one marker allele is associated with disease.
Based on both theoretical and simulation results, a general score statistic, which generalizes the transmission/disequilibrium test, tends to offer sufficient power for a variety of genetic mechanisms, so that it is worth considering for broad use in studies which use genetic marker information from both diseased cases and their parents.
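For orientation, the transmission/disequilibrium test that the general score statistic extends can be sketched for the simplest case of a marker with two alleles. The sketch below is the classic biallelic TDT (a McNemar-type chi-square on transmissions from heterozygous parents), not the multiallelic score statistic developed in the paper. The function name `tdt_statistic` and the example counts are hypothetical, chosen only for illustration.

```python
import math


def tdt_statistic(b: int, c: int) -> tuple[float, float]:
    """Classic biallelic TDT (illustrative sketch, not the paper's score test).

    b: number of heterozygous parents who transmitted allele A1 to the case
    c: number of heterozygous parents who transmitted allele A2 to the case
    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    if b < 0 or c < 0:
        raise ValueError("transmission counts must be non-negative")
    n = b + c
    if n == 0:
        raise ValueError("no informative (heterozygous) parents")
    # Under the null hypothesis each allele is transmitted with probability 1/2,
    # so (b - c)^2 / (b + c) is asymptotically chi-square with 1 df.
    chi2 = (b - c) ** 2 / n
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p = tdt_statistic(b=60, c=40)
print(f"TDT chi-square = {chi2:.2f}, p = {p:.4f}")
```

Only heterozygous parents contribute, because a homozygous parent transmits the same allele regardless of linkage. This is also why the comparison is protected against spurious association from population structure: each parent serves as its own control.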