
Inflation of polygenic risk scores caused by sample overlap and relatedness: Examples of a major risk of bias.

Affiliations

Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA.

Epilepsy Research Centre, Department of Medicine, University of Melbourne, Austin Health, Heidelberg, VIC 3084, Australia; Population Health and Immunity Division, the Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; Department of Medical Biology, University of Melbourne, Melbourne, VIC 3052, Australia.

Publication information

Am J Hum Genet. 2024 Sep 5;111(9):1805-1809. doi: 10.1016/j.ajhg.2024.07.014. Epub 2024 Aug 20.

DOI: 10.1016/j.ajhg.2024.07.014
PMID: 39168121
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11393675/
Abstract

Polygenic risk scores (PRSs) are an important tool for understanding the role of common genetic variants in human disease. Standard best practices recommend that PRSs be analyzed in cohorts that are independent of the genome-wide association study (GWAS) used to derive the scores without sample overlap or relatedness between the two cohorts. However, identifying sample overlap and relatedness can be challenging in an era of GWASs performed by large biobanks and international research consortia. Although most genomics researchers are aware of best practices and theoretical concerns about sample overlap and relatedness between GWAS and PRS cohorts, the prevailing assumption is that the risk of bias is small for very large GWASs. Here, we present two real-world examples demonstrating that sample overlap and relatedness is not a minor or theoretical concern but an important potential source of bias in PRS studies. Using a recently developed statistical adjustment tool, we found that excluding overlapping and related samples was equal to or more powerful than adjusting for overlap bias. Our goal is to make genomics researchers aware of the magnitude of risk of bias from sample overlap and relatedness and to highlight the need for mitigation tools, including independent validation cohorts in PRS studies, continued development of statistical adjustment methods, and tools for researchers to test their cohorts for overlap and relatedness with GWAS cohorts without sharing individual-level data.
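
The inflation described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' analysis and does not reproduce the statistical adjustment tool they mention. It assumes a phenotype with no genetic component at all, independent SNPs with allele frequency 0.5, and complete overlap between the GWAS and PRS samples. All function names and parameters (`simulate_genotypes`, `marginal_effects`, `score`, `N`, `M`) are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_genotypes(rng, n, m):
    """n individuals x m SNPs at allele frequency 0.5, standardised to mean 0, variance 1."""
    sd = math.sqrt(0.5)  # standard deviation of Binomial(2, 0.5)
    return [[((rng.random() < 0.5) + (rng.random() < 0.5) - 1.0) / sd
             for _ in range(m)]
            for _ in range(n)]


def marginal_effects(geno, pheno):
    """Toy GWAS: one marginal effect estimate per SNP (covariance with the phenotype)."""
    n, m = len(geno), len(geno[0])
    mean_y = sum(pheno) / n
    return [sum(geno[i][j] * (pheno[i] - mean_y) for i in range(n)) / n
            for j in range(m)]


def score(geno, weights):
    """Polygenic score: weighted sum of standardised genotypes for each individual."""
    return [sum(g * w for g, w in zip(row, weights)) for row in geno]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
N, M = 200, 100  # assumed toy sizes: individuals per cohort, number of SNPs

# Discovery (GWAS) cohort with a purely random phenotype: the true PRS signal is zero.
gwas_geno = simulate_genotypes(rng, N, M)
gwas_pheno = [rng.gauss(0.0, 1.0) for _ in range(N)]
weights = marginal_effects(gwas_geno, gwas_pheno)

# Case 1: the PRS is evaluated in the same individuals used for the GWAS (full overlap).
r_overlap = pearson(score(gwas_geno, weights), gwas_pheno)

# Case 2: the PRS is evaluated in an independent cohort drawn the same way.
indep_geno = simulate_genotypes(rng, N, M)
indep_pheno = [rng.gauss(0.0, 1.0) for _ in range(N)]
r_independent = pearson(score(indep_geno, weights), indep_pheno)

print(f"PRS-phenotype correlation, overlapping cohort:  {r_overlap:.3f}")
print(f"PRS-phenotype correlation, independent cohort: {r_independent:.3f}")
```

Under these assumptions the overlapping cohort shows a clearly positive PRS-phenotype correlation even though the phenotype has no genetic basis, while the correlation in the independent cohort stays close to zero. Real studies involve partial overlap, relatedness, and linkage disequilibrium, none of which this sketch models.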
