Quantifying posterior effect size distribution of susceptibility loci by common summary statistics.

Affiliations

Biostatistics Department, University of Kentucky, Lexington, Kentucky.

Biostatistics and Computational Biology, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina.

Publication information

Genet Epidemiol. 2020 Jun;44(4):339-351. doi: 10.1002/gepi.22286. Epub 2020 Feb 25.

Abstract

Testing millions of single nucleotide polymorphisms (SNPs) in genetic association studies has become a standard routine for disease gene discovery. In light of recent re-evaluation of statistical practice, it has been suggested that p-values are unfit as summaries of statistical evidence. Despite this criticism, p-values contain information that can be utilized to address the concerns about their flaws. We present a new method for utilizing evidence summarized by p-values for estimating odds ratio (OR) based on its approximate posterior distribution. In our method, only p-values, sample size, and standard deviation for ln(OR) are needed as summaries of data, accompanied by a suitable prior distribution for ln(OR) that can assume any shape. The parameter of interest, ln(OR), is the only parameter with a specified prior distribution, hence our model is a mix of classical and Bayesian approaches. We show that our method retains the main advantages of the Bayesian approach: it yields direct probability statements about hypotheses for OR and is resistant to biases caused by selection of top-scoring SNPs. Our method enjoys greater flexibility than similarly inspired methods in the assumed distribution for the summary statistic and in the form of the prior for the parameter of interest. We illustrate our method by presenting interval estimates of effect size for reported genetic associations with lung cancer. Although we focus on OR, the method is not limited to this particular measure of effect size and can be used broadly for assessing reliability of findings in studies testing multiple predictors.
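
The abstract names the inputs (a p-value, the sample size, the standard deviation for ln(OR), and a prior of arbitrary shape) but gives no formulas. The sketch below is one simplified reading of that recipe, not the authors' implementation. It assumes the test statistic is approximately normal with mean sqrt(N) * ln(OR) / sigma and unit variance, recovers a signed z from a two-sided p-value, and evaluates the posterior on a discrete grid so that the prior can take any shape. The function names, the grid, the Laplace-shaped prior, and all numeric inputs are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def z_from_p(p_value, effect_sign=1):
    """Signed z-statistic implied by a two-sided p-value."""
    # inv_cdf(p/2) is the lower-tail quantile (negative), so negate it.
    return -effect_sign * STD_NORMAL.inv_cdf(p_value / 2.0)


def posterior_weights(p_value, n, sigma, grid, prior, effect_sign=1):
    """Posterior probabilities for ln(OR) on a discrete grid.

    Assumption (not from the source): z ~ Normal(sqrt(n) * ln(OR) / sigma, 1).
    `prior` holds non-negative weights, one per grid point, of any shape;
    it does not need to be normalized.
    """
    z = z_from_p(p_value, effect_sign)
    scale = math.sqrt(n) / sigma
    unnorm = [w * STD_NORMAL.pdf(z - scale * b) for b, w in zip(grid, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def summarize(grid, post, level=0.95):
    """Posterior mean of ln(OR), equal-tailed interval, and P(OR > 1)."""
    mean = sum(b * w for b, w in zip(grid, post))
    prob_positive = sum(w for b, w in zip(grid, post) if b > 0)
    tail = (1.0 - level) / 2.0
    lower = upper = None
    cum = 0.0
    for b, w in zip(grid, post):
        cum += w
        if lower is None and cum >= tail:
            lower = b
        if upper is None and cum >= 1.0 - tail:
            upper = b
    return mean, (lower, upper), prob_positive


if __name__ == "__main__":
    # Hypothetical inputs: a genome-wide significant SNP in 10,000 subjects.
    grid = [i * 0.005 for i in range(-100, 101)]      # ln(OR) from -0.5 to 0.5
    prior = [math.exp(-abs(b) / 0.05) for b in grid]  # Laplace-shaped, peaked at 0
    post = posterior_weights(5e-8, n=10_000, sigma=2.0, grid=grid, prior=prior)
    mean, (lo, hi), prob = summarize(grid, post)
    print(f"posterior mean OR: {math.exp(mean):.3f}")
    print(f"95% interval for OR: ({math.exp(lo):.3f}, {math.exp(hi):.3f})")
    print(f"P(OR > 1): {prob:.4f}")
```

Because the prior in this sketch is concentrated near ln(OR) = 0, the posterior mean is pulled toward zero relative to the naive estimate z * sigma / sqrt(N). That shrinkage is the kind of behavior the abstract refers to when it describes resistance to bias from selecting top-scoring SNPs, and P(OR > 1) is an example of a direct probability statement about a hypothesis for OR.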
