Two-stage designs in case-control association analysis.

Author information

Zuo Yijun, Zou Guohua, Zhao Hongyu

Affiliation

Department of Statistics and Probability, Michigan State University, Michigan 48824, USA.

Publication information

Genetics. 2006 Jul;173(3):1747-60. doi: 10.1534/genetics.105.042648. Epub 2006 Apr 19.

Abstract

DNA pooling is a cost-effective approach for collecting information on marker allele frequency in genetic studies. It is often suggested as a screening tool to identify a subset of candidate markers from a very large number of markers, which are then followed up by more accurate and informative individual genotyping. In this article, we investigate several statistical properties and design issues related to this two-stage design, including the selection of the candidate markers for second-stage analysis, the statistical power of this design, and the probability that truly disease-associated markers are ranked among the top after second-stage analysis. We have derived analytical results on the proportion of markers to be selected for second-stage analysis. For example, to detect disease-associated markers with an allele frequency difference of 0.05 between the cases and controls through an initial sample of 1000 cases and 1000 controls, our results suggest that when the measurement errors are small (0.005), approximately 3% of the markers should be selected. For the statistical power to identify disease-associated markers, we find that the measurement errors associated with DNA pooling have little effect on the power of the two-stage design. This is in contrast to the one-stage pooling scheme, where measurement errors may have a large effect on statistical power. As for the probability that the disease-associated markers are ranked among the top in the second stage, we show that there is a high probability that at least one disease-associated marker is ranked among the top when the allele frequency differences between the cases and controls are at least 0.05 for reasonably large sample sizes, even when the errors associated with DNA pooling in the first stage are not small. Therefore, the two-stage design with DNA pooling as a screening tool offers an efficient strategy in genomewide association studies, even when the measurement errors associated with DNA pooling are nonnegligible. For any disease model, we find that all the statistical results essentially depend on the population allele frequency and the allele frequency differences between the cases and controls at the disease-associated markers. The general conclusions hold whether the second stage uses an entirely independent sample or includes both the samples used in the first stage and an independent set of samples.
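The first-stage screening step described in the abstract can be illustrated with a rough normal-approximation sketch. This is not the authors' derivation. It assumes that the pooled allele frequency difference between cases and controls is approximately normal, that each pool's measurement error is independent with standard deviation `meas_sd` (one possible reading of the abstract's "0.005"), and that markers are selected by a two-sided cutoff that passes a fraction `top_fraction` of null markers. The function name, its parameters, and the control allele frequency of 0.3 are all hypothetical choices made for illustration.

```python
from statistics import NormalDist


def stage1_selection_prob(p_control, diff, n_cases, n_controls,
                          meas_sd, top_fraction):
    """Approximate probability that a marker with case-control allele
    frequency difference `diff` passes a pooled first-stage screen.

    Illustrative assumptions (not taken from the paper):
    - the pooled frequency difference is approximately normal;
    - each pool adds independent measurement error with SD `meas_sd`;
    - the cutoff is chosen so that a fraction `top_fraction` of null
      markers (no frequency difference) is selected.
    """
    std_normal = NormalDist()
    p_case = p_control + diff

    # Variance of the observed difference for an associated marker:
    # binomial sampling over 2n alleles per group, plus one
    # measurement-error term for each of the two pools.
    var_alt = (p_case * (1 - p_case) / (2 * n_cases)
               + p_control * (1 - p_control) / (2 * n_controls)
               + 2 * meas_sd ** 2)

    # Variance for a null marker (same frequency in both groups).
    var_null = (p_control * (1 - p_control) / (2 * n_cases)
                + p_control * (1 - p_control) / (2 * n_controls)
                + 2 * meas_sd ** 2)

    # Two-sided cutoff on the raw difference.
    cutoff = std_normal.inv_cdf(1 - top_fraction / 2) * var_null ** 0.5

    # P(|D| > cutoff) where D ~ Normal(diff, var_alt).
    sd_alt = var_alt ** 0.5
    upper = 1 - std_normal.cdf((cutoff - diff) / sd_alt)
    lower = std_normal.cdf((-cutoff - diff) / sd_alt)
    return upper + lower


# Scenario loosely based on the abstract: 1000 cases, 1000 controls,
# frequency difference 0.05, measurement error 0.005, top 3% selected.
# The control allele frequency of 0.3 is an assumed value.
prob = stage1_selection_prob(0.3, 0.05, 1000, 1000, 0.005, 0.03)
print(f"Approximate first-stage selection probability: {prob:.3f}")
```

Calling the function with `diff=0.0` returns `top_fraction` itself, which is a quick check that the cutoff is calibrated against null markers. Changing `meas_sd` or `top_fraction` shows how the screening probability responds under these simplified assumptions.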
