Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, California, United States of America.
PLoS Genet. 2010 Sep 2;6(9):e1001096. doi: 10.1371/journal.pgen.1001096.
We consider the feasibility of reusing existing control data from genetic association studies to reduce the cost of new studies. We discuss how to control for the population differences between cases and controls that are implicit in studies using external control data. We give theoretical calculations of the statistical power of a test due to Bourgain et al. (Am J Human Genet 2003), applied to the problem of case-control differences in genetic ancestry related to population isolation or population admixture. The theoretical results show that the non-centrality parameter of a test of association may be bounded, which limits study power even if sample sizes grow arbitrarily large. We apply this method to data from a multi-center, geographically diverse, genome-wide association study of breast cancer in African-American women. Our analysis of these data shows that admixture proportions differ by center, with the average fraction of European admixture ranging from approximately 20% for participants from study sites in the Eastern United States to 25% for participants from West Coast sites. However, these between-site differences in average admixture fraction are largely counterbalanced by considerable diversity in individual admixture proportions within each study site. Our results suggest that statistical correction for admixture differences is feasible for future studies of African-Americans that use the existing controls from the African-American Breast Cancer study, even if case ascertainment for those studies is not balanced over the same centers or regions that supplied the controls for the current study.
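To make the remark about a bounded non-centrality parameter concrete, the following is a minimal sketch of how a ceiling on the non-centrality parameter translates into a ceiling on power. It is not the Bourgain et al. test or the calculation in this paper. It assumes a generic 1-degree-of-freedom chi-square test and a textbook two-sample form for the non-centrality parameter, in which a fixed pool of reused controls caps the parameter however many cases are added. The function names `power_1df` and `toy_ncp`, the effect size, and the sample sizes are all hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_1df(ncp, alpha=0.05):
    """Power of a two-sided 1-df chi-square test with non-centrality ncp.

    Uses the fact that a 1-df non-central chi-square variable is the square
    of a normal variable with mean sqrt(ncp) and unit variance.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = ncp ** 0.5
    upper_tail = 1 - _STD_NORMAL.cdf(z_crit - shift)
    lower_tail = _STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def toy_ncp(n_cases, n_controls, effect):
    """Hypothetical non-centrality parameter for a two-sample comparison.

    ASSUMPTION: ncp = effect^2 * n_cases * n_controls / (n_cases + n_controls),
    a textbook approximation, not the formula derived in the paper. With
    n_controls fixed, it can never exceed effect^2 * n_controls.
    """
    return effect ** 2 * n_cases * n_controls / (n_cases + n_controls)


# Illustrative values only: a fixed pool of reused controls, growing case counts.
n_controls, effect = 2000, 0.05
ceiling = power_1df(effect ** 2 * n_controls)
for n_cases in (1_000, 10_000, 100_000, 1_000_000):
    ncp = toy_ncp(n_cases, n_controls, effect)
    print(f"cases={n_cases:>9,}  ncp={ncp:.3f}  power={power_1df(ncp):.3f}")
print(f"limiting power as cases grow without bound: {ceiling:.3f}")
```

Under these assumptions, power rises with the number of cases but levels off below the value implied by the cap on the non-centrality parameter, which is the qualitative behaviour the abstract describes.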