Prentice Ross L, Qi Lihong
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
Biostatistics. 2006 Jul;7(3):339-54. doi: 10.1093/biostatistics/kxj020. Epub 2006 Jan 27.
The state of readiness for high-dimensional single nucleotide polymorphism (SNP) epidemiologic association studies is described, as background for a discussion of statistical aspects of case-control study design and analysis. Specifically, the important role that multistage designs can play in the elimination of false-positive associations and in the control of study costs will be noted. Also, the trade-offs associated with using pooled DNA at early design stages for additional important cost reductions will be discussed in some detail. An odds ratio approach to relating SNP alleles to disease risk using pooled DNA will be proposed, in conjunction with a simple empirical variance estimator, based on comparisons among log-odds ratio estimators from distinct pairs of case and control pools. Simulation studies will be presented to evaluate the moderate sample size properties of such multistage designs and estimation procedures. The design of an ongoing three-stage study in the Women's Health Initiative to relate 250,000 SNPs to the risk of coronary heart disease, stroke, and breast cancer will provide illustration, and will be used to motivate the choice of simulation configurations.
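As a rough illustration of the pooled-DNA odds ratio idea described above, the following Python sketch computes one log-odds ratio per case/control pool pair and uses the spread among those pairwise estimates as an empirical variance. It is a minimal sketch under stated assumptions, not the estimator as specified in the paper: the function name `pooled_log_odds_ratio`, the equal weighting of pool pairs, and the use of the sample variance divided by the number of pairs are all illustrative choices.

```python
import math
from statistics import mean, variance


def pooled_log_odds_ratio(case_freqs, control_freqs):
    """Combine per-pool allele frequencies into a log-odds ratio estimate.

    case_freqs[k] and control_freqs[k] are the estimated allele frequencies
    in the k-th case pool and the k-th control pool (hypothetical inputs).
    Returns (estimate, standard_error, per_pair_log_odds_ratios).
    """
    if len(case_freqs) != len(control_freqs):
        raise ValueError("need the same number of case and control pools")
    if len(case_freqs) < 2:
        raise ValueError("need at least two pool pairs to estimate a variance")

    def logit(p):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
        return math.log(p / (1.0 - p))

    # One log-odds ratio per distinct (case pool, control pool) pair.
    log_ors = [logit(pc) - logit(pk) for pc, pk in zip(case_freqs, control_freqs)]

    # Assumption: equal-weight average across pairs.
    estimate = mean(log_ors)

    # Empirical variance of the average, from variation among the pairs.
    var_of_mean = variance(log_ors) / len(log_ors)

    return estimate, math.sqrt(var_of_mean), log_ors


# Hypothetical allele frequencies from four case pools and four control pools.
est, se, pairs = pooled_log_odds_ratio(
    [0.32, 0.35, 0.30, 0.34],
    [0.28, 0.29, 0.27, 0.30],
)
print(f"log-odds ratio = {est:.3f}, standard error = {se:.3f}")
```

Because the variance comes from comparing pool pairs with one another, it reflects pool-to-pool measurement variation as well as sampling variation, without requiring a model for the pooling error. In a multistage design, a statistic such as the estimate divided by its standard error could be used to decide which SNPs advance to the next stage.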