Gail M H, Pee D, Benichou J, Carroll R
Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland 20892-7368, USA.
Genet Epidemiol. 1999;16(1):15-39. doi: 10.1002/(SICI)1098-2272(1999)16:1<15::AID-GEPI3>3.0.CO;2-8.
One can obtain population-based estimates of the penetrance of a measurable mutation from cohort studies, from population-based case-control studies, and from genotyped-proband designs (GPD). In a GPD, we assume that representative individuals (probands) agree to be genotyped, and one then obtains information on the phenotypes of first-degree relatives. We also consider an extension of the GPD in which a relative is genotyped (GPDR design). In this paper, we give methods and tables for determining sample sizes needed to achieve desired precision for penetrance estimates from such studies. We emphasize dichotomous phenotypes, but methods for survival data are also given. In an example based on the BRCA1 gene and parameters given by Claus et al. [(1991) Am J Hum Genet 48:232-242], we find that similar large numbers of families need to be studied using the cohort, case-control, and GPD designs if the allele frequency is known, though the GPDR design requires fewer families, and, if one can study mainly probands with disease, the GPD design also requires fewer families. If the allele frequency is not known, somewhat larger sample sizes are required. Surprisingly, studies with mixtures of families of affected and non-affected probands can sometimes be more efficient than studies based exclusively on affected probands when the allele frequency is unknown. We discuss the feasibility and validity of these designs and point out that GPD and GPDR designs are more susceptible to a bias that results when the tendency for an individual to volunteer to be a proband or to be a subject in a cohort or case-control study depends on the phenotypes of his or her relatives.
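As a rough illustration of why the cohort design calls for large numbers when the mutation is rare, the sketch below sizes a cohort for a dichotomous phenotype using a plain normal-approximation (Wald) interval for a binomial proportion. This is not the method or the tables of the paper. It assumes Hardy-Weinberg proportions, treats carriers as independent individuals, and ignores family structure entirely. The function names and the example inputs (penetrance 0.5, allele frequency 0.01, half-width 0.1) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def carrier_frequency(q):
    """Probability of carrying at least one mutant allele, assuming
    Hardy-Weinberg proportions with allele frequency q (an assumption)."""
    return 1.0 - (1.0 - q) ** 2


def carriers_needed(penetrance, half_width, confidence=0.95):
    """Number of carriers so that a Wald interval for the penetrance has
    roughly the requested half-width (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(z ** 2 * penetrance * (1.0 - penetrance) / half_width ** 2)


def cohort_size(penetrance, half_width, q, confidence=0.95):
    """Cohort size whose expected number of carriers meets carriers_needed."""
    n_carriers = carriers_needed(penetrance, half_width, confidence)
    return math.ceil(n_carriers / carrier_frequency(q))


if __name__ == "__main__":
    # Hypothetical inputs, not taken from Claus et al. or from the paper.
    print(carriers_needed(0.5, 0.1))
    print(cohort_size(0.5, 0.1, q=0.01))
```

Because the required cohort scales inversely with the carrier frequency, halving the allele frequency roughly doubles the number of people who must be genotyped, which is consistent with the interest in designs that enrich for carriers, such as sampling mainly affected probands.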