Power for genetic association studies with random allele frequencies and genotype distributions.

Authors

Ambrosius Walter T, Lange Ethan M, Langefeld Carl D

Affiliation

Section on Biostatistics, Department of Public Health Sciences, Wake Forest University School of Medicine, Winston-Salem, NC, USA.

Publication information

Am J Hum Genet. 2004 Apr;74(4):683-93. doi: 10.1086/383282. Epub 2004 Mar 12.

DOI:10.1086/383282

PMID:15024689

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1181944/

Abstract

One of the first and most important steps in planning a genetic association study is the accurate estimation of the statistical power under a proposed study design and sample size. In association studies for candidate genes or in fine-mapping applications, allele and genotype frequencies are often assumed to be known when, in fact, they are unknown (i.e., random variables from some distribution). For example, if we consider a diallelic marker with allele frequencies of 0.5 and 0.5 and Hardy-Weinberg proportions, the three genotype frequencies are often assumed to be 0.25, 0.50, and 0.25, and the statistical power is calculated. Unfortunately, ignoring this source of variation can inflate the estimated power of the study. In the present article, we propose averaging the estimates of power over the distribution of the genotype frequencies to calculate the true estimate of power for a fixed allele frequency. For the usual situation, in which allele frequencies in a population are not known, we propose placing a prior distribution on the allele frequency, taking advantage of any available genotype information. This Bayesian approach provides a more accurate estimate of power. We present examples for quantitative and qualitative traits in cohort studies of unrelated individuals and results from an extensive series of examples that show that ignoring the uncertainty in allele frequencies can inflate the estimated power of the study. We also present the results from case-control studies and show that standard methods may also overestimate power. As discussed in this article, the approach of fixing allele frequencies even if they are not known is the common approach to power calculations. We show that ignoring the sources of variation in allele frequencies tends to result in overestimates of power and, consequently, in studies that are underpowered. Software in C is available at http://www.ambrosius.net/Power/.
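
The averaging idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' C software and makes several simplifying assumptions that are not in the source: a quantitative trait with an additive effect `beta` per allele copy, a known residual standard deviation `sigma`, a two-sided z-test on the regression slope, and Hardy-Weinberg proportions. All function names (`fixed_power`, `averaged_power`, `prior_averaged_power`) and the Beta(a, b) prior are hypothetical choices made for illustration. The sketch compares power computed from assumed-known genotype frequencies with power averaged over the multinomial distribution of genotype counts, and then averages again over a prior on the allele frequency.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_from_sxx(sxx, beta, sigma, alpha=0.05):
    """Two-sided z-test power for a slope, given the genotype sum of squares."""
    if sxx <= 1e-12:
        return 0.0  # only one genotype present: the slope cannot be tested
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(beta) * sqrt(sxx) / sigma
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def fixed_power(n, p, beta, sigma, alpha=0.05):
    """Conventional power: genotype frequencies treated as known (HWE)."""
    # With expected counts, the sum of squares of the 0/1/2 code is n * 2p(1-p).
    return power_from_sxx(n * 2 * p * (1 - p), beta, sigma, alpha)


def averaged_power(n, p, beta, sigma, alpha=0.05):
    """Power averaged over multinomial genotype counts; assumes 0 < p < 1."""
    log_q = [log((1 - p) ** 2), log(2 * p * (1 - p)), log(p ** 2)]
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            # Log multinomial probability of observing (n0, n1, n2).
            log_pmf = (lgamma(n + 1) - lgamma(n0 + 1) - lgamma(n1 + 1)
                       - lgamma(n2 + 1)
                       + n0 * log_q[0] + n1 * log_q[1] + n2 * log_q[2])
            # Sum of squares of the genotype code about its sample mean.
            sxx = (n1 + 4 * n2) - (n1 + 2 * n2) ** 2 / n
            total += exp(log_pmf) * power_from_sxx(sxx, beta, sigma, alpha)
    return total


def prior_averaged_power(n, a, b, beta, sigma, alpha=0.05, grid=40):
    """Average power over an assumed Beta(a, b) prior on the allele frequency."""
    weighted, weight_sum = 0.0, 0.0
    for k in range(grid):
        p = (k + 0.5) / grid  # midpoint rule on (0, 1)
        w = p ** (a - 1) * (1 - p) ** (b - 1)  # unnormalized Beta density
        weighted += w * averaged_power(n, p, beta, sigma, alpha)
        weight_sum += w
    return weighted / weight_sum


if __name__ == "__main__":
    print("fixed   :", fixed_power(50, 0.3, 0.5, 1.0))
    print("averaged:", averaged_power(50, 0.3, 0.5, 1.0))
    print("prior   :", prior_averaged_power(30, 6, 14, 0.5, 1.0))
```

The enumeration over genotype counts grows quadratically with the sample size, so this sketch is practical only for modest `n`; a simulation-based average would be one alternative for larger studies.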
