Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies.

Authors

Zhong Hua, Prentice Ross L

Affiliation

Department of Biostatistics, University of Washington, Seattle, WA 98105, USA.

Publication

Biostatistics. 2008 Oct;9(4):621-34. doi: 10.1093/biostatistics/kxn001. Epub 2008 Feb 28.

Abstract

Genome-wide association studies (GWAS) provide an important approach to identifying common genetic variants that predispose to human disease. A typical GWAS may genotype hundreds of thousands of single nucleotide polymorphisms (SNPs) located throughout the human genome in a set of cases and controls. Logistic regression is often used to test for association between a SNP genotype and case versus control status, with corresponding odds ratios (ORs) typically reported only for those SNPs meeting selection criteria. However, when these estimates are based on the original data used to detect the variant, the results are affected by a selection bias sometimes referred to as the "winner's curse" (Capen and others, 1971), and the actual genetic association is typically overestimated. We show that such selection bias may be severe, in the sense that the conditional expectation of the standard OR estimator may be quite far from the underlying parameter. In addition, standard confidence intervals (CIs) for the selected ORs may have coverage far from the desired rate. To reduce selection bias, we propose and evaluate 3 bias-reduced estimators, along with corresponding weighted estimators that combine corrected and uncorrected estimators. Their corresponding CIs are also proposed. We study the performance of these estimators using simulated data sets and show that they reduce the bias and give CI coverage close to the desired level under various scenarios, even for associations having only small statistical power.
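
The selection bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method or any of their proposed estimators. It only assumes that the estimated log OR is approximately normal with known standard error, and that a SNP is reported when its two-sided z-test passes a threshold. The function names, the true OR of 1.2, the standard error of 0.1, and the significance level of 1e-4 are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def conditional_mean_log_or(beta, se, c):
    """Analytic E[beta_hat | |beta_hat / se| > c] when beta_hat ~ N(beta, se^2).

    Returns the conditional mean and the selection probability (power).
    Assumption: normal approximation with a known standard error.
    """
    mu = beta / se
    upper = 1.0 - STD_NORMAL.cdf(c - mu)   # P(Z > c)
    lower = STD_NORMAL.cdf(-c - mu)        # P(Z < -c)
    power = upper + lower
    # Mean shift of a normal variable truncated to |Z| > c.
    shift = (STD_NORMAL.pdf(c - mu) - STD_NORMAL.pdf(-c - mu)) / power
    return se * (mu + shift), power


def simulate_selected_mean(beta, se, c, n_rep=200_000, seed=0):
    """Monte Carlo mean of beta_hat over replicates that pass selection."""
    rng = random.Random(seed)
    draws = (rng.gauss(beta, se) for _ in range(n_rep))
    selected = [b for b in draws if abs(b / se) > c]
    return fmean(selected), len(selected) / n_rep


# Hypothetical scenario: true OR 1.2, standard error 0.1, alpha = 1e-4.
true_beta = math.log(1.2)
std_err = 0.1
threshold = STD_NORMAL.inv_cdf(1 - 1e-4 / 2)

analytic_mean, power = conditional_mean_log_or(true_beta, std_err, threshold)
simulated_mean, selected_fraction = simulate_selected_mean(true_beta, std_err, threshold)

print(f"true OR:                        {math.exp(true_beta):.3f}")
print(f"power (analytic):               {power:.4f}")
print(f"fraction selected (simulated):  {selected_fraction:.4f}")
print(f"exp(E[log OR | selected]):      {math.exp(analytic_mean):.3f}")
print(f"exp(simulated mean log OR):     {math.exp(simulated_mean):.3f}")
```

Under these assumptions, the mean log OR among selected replicates exceeds the true log OR whenever the true effect is positive, and the gap grows as power falls. This is the low-power regime the abstract highlights.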
