Schaid D J, Sommer S S
Department of Health Sciences Research, Mayo Clinic/Foundation, Rochester, MN 55905.
Am J Hum Genet. 1993 Nov;53(5):1114-26.
Design and analysis methods are presented for studying the association of a candidate gene with a disease by using parental data in place of nonrelated controls. This alternative design eliminates spurious differences in allele frequencies between cases and nonrelated controls resulting from different ethnic origins and population stratification for these two groups. We present analysis methods which are based on two genetic relative risks: (1) the relative risk of disease for homozygotes with two copies of the candidate gene versus homozygotes without the candidate gene and (2) the relative risk for heterozygotes with one copy of the candidate gene versus homozygotes without the candidate gene. In addition to estimating the magnitude of these relative risks, likelihood methods allow specific hypotheses to be tested, namely, a test for overall association of the candidate gene with disease, as well as specific genetic hypotheses, such as dominant or recessive inheritance. Two likelihood methods are presented: (1) a likelihood method appropriate when Hardy-Weinberg equilibrium holds and (2) a likelihood method in which we condition on parental genotype data when Hardy-Weinberg equilibrium does not hold. The results for the relative efficiency of these two methods suggest that the conditional approach may at times be preferable, even when equilibrium holds. Sample-size and power calculations are presented for a multitiered design. The purpose of tier 1 is to detect the presence of an abnormal sequence for a postulated candidate gene among a small group of cases. The purpose of tier 2 is to test for association of the abnormal variant with disease, such as by the likelihood methods presented. The purpose of tier 3 is to confirm positive results from tier 2. Results indicate that required sample sizes are smaller when expression of disease is recessive, rather than dominant, and that, for recessive disease and large relative risks, necessary sample sizes may be feasible, even if only a small percentage of the disease can be attributed to the candidate gene.
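To make the conditional approach concrete, the following is a minimal Python sketch of a likelihood that conditions on parental genotypes. It is an illustration under stated assumptions, not the authors' implementation. It assumes case-parent triads, a biallelic candidate locus, and that, given the parental mating type, the probability of an affected child's genotype is proportional to the Mendelian transmission probability times the genotype relative risk. The names `MENDEL`, `conditional_loglik`, `fit_relative_risks`, and `overall_association_test`, the simple compass-search optimizer, the use of a 2-degree-of-freedom likelihood ratio test for overall association, and the triad counts are all hypothetical choices made for this example.

```python
import math

# Mendelian probabilities of the child's genotype (copies of the candidate
# allele: 0, 1, or 2) for each informative parental mating type, keyed by the
# two parents' copy counts. Matings between two homozygotes are omitted
# because the child's genotype is then fixed and carries no information.
MENDEL = {
    (2, 1): {2: 0.5, 1: 0.5},
    (1, 1): {2: 0.25, 1: 0.5, 0: 0.25},
    (1, 0): {1: 0.5, 0: 0.5},
}


def conditional_loglik(counts, psi1, psi2):
    """Log-likelihood of affected children's genotypes given parental mating type.

    counts maps (mating_type, child_genotype) to a number of triads.
    psi1 is the relative risk for one copy and psi2 for two copies, both
    relative to homozygotes without the candidate allele.
    """
    risk = {0: 1.0, 1: psi1, 2: psi2}
    total = 0.0
    for (mating, child), n in counts.items():
        probs = MENDEL[mating]
        denom = sum(p * risk[g] for g, p in probs.items())
        total += n * math.log(probs[child] * risk[child] / denom)
    return total


def fit_relative_risks(counts, step=1.0, tol=1e-6):
    """Maximize the log-likelihood over (log psi1, log psi2) by compass search.

    The log-likelihood is concave in the log relative risks, so this simple
    derivative-free search is adequate for a sketch.
    """
    t1 = t2 = 0.0  # start at the null hypothesis psi1 = psi2 = 1
    best = conditional_loglik(counts, 1.0, 1.0)
    while step > tol:
        improved = False
        for d1, d2 in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = conditional_loglik(counts, math.exp(t1 + d1), math.exp(t2 + d2))
            if cand > best + 1e-12:
                best, t1, t2 = cand, t1 + d1, t2 + d2
                improved = True
        if not improved:
            step /= 2.0
    return math.exp(t1), math.exp(t2), best


def overall_association_test(counts):
    """Likelihood ratio test of psi1 = psi2 = 1 (assumed here to have 2 df)."""
    psi1, psi2, ll_hat = fit_relative_risks(counts)
    ll_null = conditional_loglik(counts, 1.0, 1.0)
    lrt = max(0.0, 2.0 * (ll_hat - ll_null))
    p_value = math.exp(-lrt / 2.0)  # chi-square survival function for 2 df
    return psi1, psi2, lrt, p_value


# Hypothetical triad counts, invented purely for illustration.
example_counts = {
    ((2, 1), 2): 12, ((2, 1), 1): 8,
    ((1, 1), 2): 15, ((1, 1), 1): 20, ((1, 1), 0): 5,
    ((1, 0), 1): 30, ((1, 0), 0): 20,
}
print(overall_association_test(example_counts))
```

In the same spirit, a dominant model could be examined by constraining `psi1 == psi2`, and a recessive model by fixing `psi1 = 1`, then comparing each constrained fit with the unconstrained one. Because the likelihood conditions on the parental mating type, it does not rely on Hardy-Weinberg equilibrium or on allele frequencies in an external control group.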