
SPC: a SPectral Component approach to address recent population structure in genomic analysis.

Authors

Shemirani Ruhollah, Belbin Gillian M, Cullina Sinead, Caggiano Christa, Gignoux Christopher, Zaitlen Noah, Kenny Eimear E

Affiliations

Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Publication information

medRxiv. 2025 Jun 5:2025.06.04.25328990. doi: 10.1101/2025.06.04.25328990.

DOI: 10.1101/2025.06.04.25328990
PMID: 40502608
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12155035/
Abstract

Population structure is a well-known confounder in statistical genetics, particularly in genome-wide association studies (GWAS), where it can lead to inflated test statistics and spurious associations. Traditional methods, such as principal components (PCs), commonly used to adjust for population structure, are limited in capturing fine-scale, non-linear patterns that arise from recent demographic events - patterns that are crucial for understanding rare variant effects. To address this challenge, we propose a novel method called SPectral Components (SPCs), which leverages identity-by-descent (IBD) graphs to capture and transform local, non-linear fine-scale population structure into continuous representations that can be seamlessly integrated into genetic analysis pipelines. Using both simulated datasets and empirical data from the UK Biobank (N ≈ 420,000), we demonstrate that SPCs outperform PCs in adjusting for fine-scale population structure. In simulations, SPCs explained over 90% of the fine-scale population structure with fewer components, while PCs captured less than 50%. In the UK Biobank, SPCs reduced the inflation of p-values in the GWAS of an environmental-driven phenotype by 12% compared to PCs, while maintaining a similar performance to PCs in height, a highly heritable phenotype. Additionally, SPCs improved rare variant association analyses, reducing genomic inflation (e.g., from 7.6 to 1.2 in one analysis), and provided more accurate heritability estimates. Spatial autocorrelation analysis further confirmed the ability of SPCs to account for environmental effects, reducing Moran's I for both environmental and heritable phenotypes more effectively than PCs. Overall, our findings demonstrate that SPCs provide a robust, scalable adjustment for recent population structure, offering a powerful alternative or complement to PCs in large-scale biobank studies.
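
The abstract reports two diagnostics without defining them: genomic inflation (for example, a drop from 7.6 to 1.2) and Moran's I for spatial autocorrelation. The sketch below shows how these two quantities are conventionally computed, using only the Python standard library. It is not the authors' code; the function names `genomic_inflation` and `morans_i`, the toy p-values, and the toy neighbor matrix are assumptions made for illustration.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda_GC: median observed 1-df chi-square over its null median.

    Each p-value must lie in (0, 1]. Values near 1.0 indicate no inflation.
    """
    # A two-sided p-value p maps to the statistic z**2 with z = Phi^-1(p / 2).
    chi2_stats = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def morans_i(values, weights):
    """Return Moran's I for `values` given a square spatial weight matrix.

    `weights[i][j]` is the (hypothetical) spatial weight between samples i and j.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products of deviations over all ordered pairs.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * cross / sum(d * d for d in dev)


if __name__ == "__main__":
    # Toy data: evenly spaced p-values mimic a well-calibrated test.
    uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
    print("lambda_GC (uniform p-values):", genomic_inflation(uniform_p))

    # Toy data: four samples on a line, adjacent samples are neighbors.
    line_weights = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print("Moran's I (smooth gradient):", morans_i([1, 2, 3, 4], line_weights))
```

In this framing, a covariate set that adjusts well for population structure should bring lambda_GC closer to 1 and shrink the Moran's I of the phenotype residuals toward 0. How the paper builds its spatial weights and which test statistics feed the inflation estimate are not stated in the abstract.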


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1941/12155035/2d34a5fab497/nihpp-2025.06.04.25328990v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1941/12155035/c50ccde8b22b/nihpp-2025.06.04.25328990v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1941/12155035/918f077376fd/nihpp-2025.06.04.25328990v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1941/12155035/23267a750ee7/nihpp-2025.06.04.25328990v1-f0003.jpg
