Testing for differences in polygenic scores in the presence of confounding.

Authors

Blanc Jennifer, Berg Jeremy J

Affiliation

Department of Human Genetics, University of Chicago, 920 E 58th St CLSC, Chicago, IL 60637, USA.

Publication

Genetics. 2025 Jun 4;230(2). doi: 10.1093/genetics/iyaf071.

DOI:10.1093/genetics/iyaf071

PMID:40233174

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12135188/

Abstract

Polygenic scores have become an important tool in human genetics, enabling the prediction of individuals' phenotypes from their genotypes. Understanding how the pattern of differences in polygenic score predictions across individuals intersects with variation in ancestry can provide insights into the evolutionary forces acting on the trait in question and is important for understanding health disparities. However, because most polygenic scores are computed using effect estimates from population samples, they are susceptible to confounding by both genetic and environmental effects that are correlated with ancestry. The extent to which this confounding drives patterns in the distribution of polygenic scores depends on the patterns of population structure in both the original estimation panel and in the prediction/test panel. Here, we use theory from population and statistical genetics, together with simulations, to study the procedure of testing for an association between polygenic scores and axes of ancestry variation in the presence of confounding. We use a general model of genetic relatedness to describe how confounding in the estimation panel biases the distribution of polygenic scores in ways that depend on the degree of overlap in population structure between panels. We then show how this confounding can bias tests for associations between polygenic scores and important axes of ancestry variation in the test panel. Specifically, for any given test, there exists a single axis of population structure in the genome-wide association study (GWAS) panel that needs to be controlled for in order to protect the test. In the context of this result, we study the behavior of multiple approaches to control for stratification along this axis, including standard methods such as using principal components as fixed covariates in the GWAS, linear mixed models, and a novel approach for directly estimating the axis using the test panel genotypes. Our analyses highlight the role of estimation noise in the models of population structure as a plausible source of residual confounding in polygenic score analyses.
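
The kind of test the abstract describes can be illustrated with a toy simulation. The sketch below is not the authors' code. The two-population model, the sample sizes, the drift magnitude, and the way the GWAS-panel axis is built (projecting GWAS genotypes onto per-SNP contrasts computed from the test panel) are all assumptions chosen for illustration. The trait has no genetic effects at all, so any association between the polygenic score and the test vector comes from confounding alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n_snps, n_gwas, n_test = 2000, 1000, 400  # illustrative sizes (assumed)

# Two populations whose allele frequencies drifted apart from a shared ancestor.
p_anc = rng.uniform(0.1, 0.9, n_snps)
p_a = np.clip(p_anc + rng.normal(0, 0.05, n_snps), 0.05, 0.95)
p_b = np.clip(p_anc + rng.normal(0, 0.05, n_snps), 0.05, 0.95)

def sample_panel(n):
    """Draw n diploid genotypes: first half from population A, second half from B."""
    label = np.repeat([1.0, 0.0], n // 2)
    freqs = np.where(label[:, None] == 1.0, p_a, p_b)
    return rng.binomial(2, freqs).astype(float), label

G, z_gwas = sample_panel(n_gwas)  # GWAS (estimation) panel
X, z_test = sample_panel(n_test)  # test (prediction) panel

# Null trait: no SNP has an effect; population A gets a purely environmental shift.
y = 1.0 * z_gwas + rng.normal(0, 1, n_gwas)

def marginal_gwas(G, y, covar=None):
    """Per-SNP OLS slope of y on genotype, after projecting out an intercept
    and, optionally, one covariate from both genotypes and phenotype."""
    n = G.shape[0]
    cols = [np.ones(n)] if covar is None else [np.ones(n), covar]
    Q, _ = np.linalg.qr(np.column_stack(cols))
    G_res = G - Q @ (Q.T @ G)
    y_res = y - Q @ (Q.T @ y)
    return (G_res.T @ y_res) / np.sum(G_res**2, axis=0)

# Test vector: centered population label in the test panel.
T = z_test - z_test.mean()
X_c = X - X.mean(axis=0)

# Hypothetical estimate of the GWAS-panel axis to control for:
# per-SNP contrasts along T in the test panel, projected onto GWAS genotypes.
r = X_c.T @ T
axis = (G - G.mean(axis=0)) @ r

def score_test_corr(beta_hat):
    """Correlation between test-panel polygenic scores and the test vector."""
    pgs = X_c @ beta_hat
    return np.corrcoef(pgs, T)[0, 1]

corr_raw = score_test_corr(marginal_gwas(G, y))
corr_adj = score_test_corr(marginal_gwas(G, y, covar=axis))
print(f"no covariate:        corr(PGS, T) = {corr_raw:.3f}")
print(f"estimated axis used: corr(PGS, T) = {corr_adj:.3f}")
```

Under these assumptions the unadjusted scores should track the population label closely even though the trait is not heritable, while adjusting for the estimated axis should largely remove that association. Shrinking `n_test` or `n_snps` makes the estimated axis noisier, which is one way to explore the residual confounding the abstract attributes to estimation noise.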
