A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework.

Affiliations

Oslo Centre for Biostatistics and Epidemiology, Oslo University Hospital, Oslo universitetssykehus HF, Sogn Arena, PB 4950 Nydalen, Oslo, 0424, Norway.

MRC Biostatistics Unit, University of Cambridge, MRC Biostatistics Unit, Cambridge Institute of Public Health, Robinson Way, Cambridge, CB2 0SR, United Kingdom.

Publication information

BMC Genomics. 2018 Jun 25;19(1):494. doi: 10.1186/s12864-018-4859-7.

DOI:10.1186/s12864-018-4859-7
PMID:29940862
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6019513/
Abstract

BACKGROUND

There is considerable evidence that many complex traits have a partially shared genetic basis, termed pleiotropy. It is therefore useful to consider integrating genome-wide association study (GWAS) data across several traits, usually at the summary statistic level. A major practical challenge arises when these GWAS have overlapping subjects. This is particularly an issue when estimating pleiotropy using methods that condition the significance of one trait on the significance of a second, such as the covariate-modulated false discovery rate (cmfdr).

RESULTS

We propose a method for correcting for sample overlap at the summary statistic level. We quantify the expected amount of spurious correlation between the summary statistics from two GWAS due to sample overlap, and use this estimated correlation in a simple linear correction that adjusts the joint distribution of test statistics from the two GWAS. The correction is appropriate for GWAS with case-control or quantitative outcomes. Our simulations and data example show that without correcting for sample overlap, the cmfdr is not properly controlled, leading to an excessive number of false discoveries and an excessive false discovery proportion. Our correction for sample overlap is effective in that it restores proper control of the false discovery rate, at very little loss in power.
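As a rough illustration of what a linear correction of this kind can look like, the sketch below whitens paired z-scores with the inverse symmetric square root of their 2x2 correlation matrix. The abstract does not give the formula, so the whitening form is an assumption rather than the authors' exact method. The function names `overlap_correlation` and `decorrelate_z` are hypothetical. The overlap correlation shown is the standard null approximation for two quantitative-trait GWAS and is not taken from this abstract; case-control designs need a different expression.

```python
import numpy as np


def overlap_correlation(n1, n2, n_shared, pheno_corr):
    """Approximate null correlation between z-scores of two quantitative-trait
    GWAS that share n_shared subjects (assumed standard approximation)."""
    return n_shared * pheno_corr / np.sqrt(n1 * n2)


def decorrelate_z(z1, z2, rho):
    """Whiten paired z-scores with the inverse symmetric square root of
    [[1, rho], [rho, 1]] (an assumed form of the linear correction)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    # The matrix has eigenvalues 1 + rho and 1 - rho with eigenvectors
    # (1, 1)/sqrt(2) and (1, -1)/sqrt(2), so its inverse square root is
    # [[a, b], [b, a]] with the entries below.
    lam_plus = 1.0 / np.sqrt(1.0 + rho)
    lam_minus = 1.0 / np.sqrt(1.0 - rho)
    a = 0.5 * (lam_plus + lam_minus)
    b = 0.5 * (lam_plus - lam_minus)
    return a * z1 + b * z2, b * z1 + a * z2


# Illustrative numbers only: two studies sharing 10,000 subjects.
rho = overlap_correlation(n1=50_000, n2=40_000, n_shared=10_000, pheno_corr=0.3)

# Simulate null z-scores whose only correlation comes from sample overlap.
rng = np.random.default_rng(0)
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=100_000)

z1_adj, z2_adj = decorrelate_z(z[:, 0], z[:, 1], rho)
print("correlation before:", np.corrcoef(z[:, 0], z[:, 1])[0, 1])
print("correlation after: ", np.corrcoef(z1_adj, z2_adj)[0, 1])
```

Under the null, the adjusted statistics keep unit variance and lose the correlation induced by overlap, which is the property a method conditioning one trait on another depends on. With `rho = 0` the transform leaves the z-scores unchanged.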

CONCLUSIONS

With our proposed correction, it is possible to integrate GWAS summary statistics with overlapping samples in a statistical framework that is dependent on the joint distribution of the two GWAS.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9093/6019513/0f9d55c66d30/12864_2018_4859_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9093/6019513/d04cf93ef3ff/12864_2018_4859_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9093/6019513/a86b330d9052/12864_2018_4859_Fig3_HTML.jpg
