Priors, population sizes, and power in genome-wide hypothesis tests.

Affiliations

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA.

Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, 21218, USA.

Publication information

BMC Bioinformatics. 2023 Apr 26;24(1):170. doi: 10.1186/s12859-023-05261-9.

DOI:10.1186/s12859-023-05261-9
PMID:37101120
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10134629/
Abstract

BACKGROUND

Genome-wide tests, including genome-wide association studies (GWAS) of germ-line genetic variants, driver tests of cancer somatic mutations, and transcriptome-wide association tests of RNAseq data, carry a high multiple testing burden. This burden can be overcome by enrolling larger cohorts or alleviated by using prior biological knowledge to favor some hypotheses over others. Here we compare these two methods in terms of their abilities to boost the power of hypothesis testing.

RESULTS

We provide a quantitative estimate for progress in cohort sizes and present a theoretical analysis of the power of oracular hard priors: priors that select a subset of hypotheses for testing, with an oracular guarantee that all true positives are within the tested subset. This theory demonstrates that for GWAS, strong priors that limit testing to 100-1000 genes provide less power than typical annual 20-40% increases in cohort sizes. Furthermore, non-oracular priors that exclude even a small fraction of true positives from the tested set can perform worse than not using a prior at all.
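The trade-off described above can be explored with a minimal sketch. It is not the authors' analysis: it assumes a two-sided z-test per hypothesis, a Bonferroni correction across all tested hypotheses, and a noncentrality parameter that grows with the square root of the cohort size. The function names, the standardized effect size, and the figure of 20,000 genome-wide tests used in the demo are illustrative assumptions, not values taken from the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def bonferroni_z(m, alpha=0.05):
    """Two-sided critical z-value when m tests share a family-wise level alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha / (2 * m))


def power(effect, n, m, alpha=0.05):
    """Power of one two-sided z-test among m Bonferroni-corrected tests.

    Assumption: the test statistic is normal with mean effect * sqrt(n),
    where `effect` is a hypothetical standardized per-sample effect size.
    """
    z_crit = bonferroni_z(m, alpha)
    ncp = effect * n ** 0.5
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - ncp)
    lower_tail = STD_NORMAL.cdf(-z_crit - ncp)
    return upper_tail + lower_tail


def equivalent_cohort_factor(m_full, m_prior, alpha=0.05, target_power=0.8):
    """Cohort growth factor that lets m_full tests match an oracular hard prior.

    Under the z-test model, reaching `target_power` needs a noncentrality of
    roughly z_crit + z_beta, and the required n scales with its square.
    The negligible opposite tail is ignored.
    """
    z_beta = STD_NORMAL.inv_cdf(target_power)
    ratio = (bonferroni_z(m_full, alpha) + z_beta) / (bonferroni_z(m_prior, alpha) + z_beta)
    return ratio ** 2


def recall_with_hard_prior(effect, n, m_prior, excluded_fraction, alpha=0.05):
    """Expected share of true positives found when a non-oracular hard prior
    drops `excluded_fraction` of them before any testing happens."""
    return (1 - excluded_fraction) * power(effect, n, m_prior, alpha)


if __name__ == "__main__":
    m_full = 20_000  # assumed number of gene-level tests without a prior
    for m_prior in (1000, 100):
        factor = equivalent_cohort_factor(m_full, m_prior)
        print(f"prior keeping {m_prior} genes ~ cohort growth factor {factor:.2f}")
    print("no prior:", power(0.05, 20_000, m_full))
    print("prior missing 10% of true positives:",
          recall_with_hard_prior(0.05, 20_000, 1000, 0.10))
```

Because the Bonferroni critical value grows only slowly with the number of tests, the growth factor stays modest even for aggressive priors, and `recall_with_hard_prior` is capped at one minus the excluded fraction no matter how large the cohort becomes. Both behaviours are properties of this simplified model and should be read as intuition for the abstract's claims rather than as a reproduction of the paper's results.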

CONCLUSION

Our results provide a theoretical explanation for the continued dominance of simple, unbiased univariate hypothesis tests for GWAS: if a statistical question can be answered by larger cohort sizes, it should be answered by larger cohort sizes rather than by more complicated biased methods involving priors. We suggest that priors are better suited for non-statistical aspects of biology, such as pathway structure and causality, that are not yet easily captured by standard hypothesis tests.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/b63e7e8e4a72/12859_2023_5261_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/7cfe3fb6cdd7/12859_2023_5261_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/b62bf781c339/12859_2023_5261_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/e99756c4d357/12859_2023_5261_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/9b580043bd4b/12859_2023_5261_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/a3ce5dad92b0/12859_2023_5261_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/0ac9c734df1b/12859_2023_5261_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/a952c52f723c/12859_2023_5261_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/1bb0b6715ae5/12859_2023_5261_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add2/10134629/47b07e2a814b/12859_2023_5261_Fig10_HTML.jpg
