Sasieni P D
Department of Mathematics, Statistics and Epidemiology, Imperial Cancer Research Fund, London, U.K.
Biometrics. 1997 Dec;53(4):1253-61.
This paper considers the analysis of genetic case-control data. One approach considers the allele frequency in cases and controls. Because each individual has two alleles at any autosomal locus, there will be twice as many alleles as people. Another approach considers the risk of the disease in those who do not have the allele of interest (A), those who have a single copy (heterozygous), and those who are homozygous for A. A third approach does not differentiate between individuals with one or two copies of A. This was common when alleles were determined serologically and one could not distinguish between homozygotes and those with one copy of A and one of an unknown allele. All three approaches have been used in the literature, but this is the first systematic comparison of them. The different interpretations of the odds ratios from such analyses are explored and conditions are given under which the first two approaches are asymptotically equivalent. The chi-squared statistics from the three approaches are discussed. Both the odds ratio and the chi-squared statistic from the analysis that treats alleles rather than genotypes as individual entities are appropriate only when the Hardy-Weinberg equilibrium holds. When the equilibrium holds, the allele-based test statistic is asymptotically equivalent to the test for trend using the genotype data. Thus, analyses that treat alleles rather than people as observations should not be used. Instead, we recommend that such data should be analyzed by genotype.
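The three approaches described in the abstract can be compared numerically. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the allele-based and carrier-based statistics use the standard Pearson chi-squared for a 2x2 table, and the genotype-based statistic is assumed to be the Cochran-Armitage trend test with scores 0, 1, 2 for the number of copies of A. Genotype counts are passed as (no copies of A, heterozygous, homozygous for A), and every margin of each table is assumed to be nonzero.

```python
from typing import Sequence


def pearson_2x2(a: float, b: float, c: float, d: float) -> float:
    """Pearson chi-squared (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def allele_table(cases: Sequence[int], controls: Sequence[int]):
    """Collapse genotype counts to allele counts: each person contributes two alleles."""
    r0, r1, r2 = cases
    s0, s1, s2 = controls
    # (A alleles in cases, other alleles in cases, A in controls, other in controls)
    return (r1 + 2 * r2, 2 * r0 + r1, s1 + 2 * s2, 2 * s0 + s1)


def allele_chi2(cases: Sequence[int], controls: Sequence[int]) -> float:
    """First approach: treat the 2N alleles as the observations."""
    return pearson_2x2(*allele_table(cases, controls))


def allele_odds_ratio(cases: Sequence[int], controls: Sequence[int]) -> float:
    """Odds ratio from the allele-count table."""
    a, b, c, d = allele_table(cases, controls)
    return (a * d) / (b * c)


def trend_chi2(cases: Sequence[int], controls: Sequence[int]) -> float:
    """Second approach: Cochran-Armitage trend test on genotypes, scores 0, 1, 2."""
    scores = (0, 1, 2)
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * t for x, t in zip(scores, totals))
    sum_x2n = sum(x * x * t for x, t in zip(scores, totals))
    numerator = n * (n * sum_xr - n_cases * sum_xn) ** 2
    denominator = n_cases * n_controls * (n * sum_x2n - sum_xn ** 2)
    return numerator / denominator


def carrier_chi2(cases: Sequence[int], controls: Sequence[int]) -> float:
    """Third approach: one or two copies of A versus none."""
    r0, r1, r2 = cases
    s0, s1, s2 = controls
    return pearson_2x2(r1 + r2, r0, s1 + s2, s0)


if __name__ == "__main__":
    # Made-up counts. The pooled genotypes (25, 50, 25) satisfy Hardy-Weinberg
    # proportions exactly, so the allele and trend statistics should agree.
    hwe_cases, hwe_controls = (10, 20, 20), (15, 30, 5)
    print(allele_chi2(hwe_cases, hwe_controls), trend_chi2(hwe_cases, hwe_controls))

    # Made-up counts with an excess of homozygotes in the pooled sample,
    # where the two statistics are expected to diverge.
    non_cases, non_controls = (30, 10, 10), (40, 5, 5)
    print(allele_chi2(non_cases, non_controls), trend_chi2(non_cases, non_controls))
```

In this sketch the allele-based and trend statistics coincide exactly when the pooled genotype counts satisfy Hardy-Weinberg proportions (heterozygote count squared equals four times the product of the two homozygote counts) and drift apart otherwise, which is consistent with the abstract's recommendation to analyze by genotype.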