Characterizing substructure via mixture modeling in large-scale genetic summary statistics.

Authors

Stoneman Hayley R, Price Adelle M, Trout Nikole Scribner, Lamont Riley, Tifour Souha, Pozdeyev Nikita, Crooks Kristy, Lin Meng, Rafaels Nicholas, Gignoux Christopher R, Marker Katie M, Hendricks Audrey E

Affiliations

Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Human Medical Genetics and Genomics Program, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.

Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO 80204, USA.

Publication information

Am J Hum Genet. 2025 Feb 6;112(2):235-253. doi: 10.1016/j.ajhg.2024.12.007. Epub 2025 Jan 16.

DOI:10.1016/j.ajhg.2024.12.007
PMID:39824191
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11866976/
Abstract

Genetic summary data are broadly accessible and highly useful, including for risk prediction, causal inference, fine mapping, and incorporation of external controls. However, collapsing individual-level data into summary data, such as allele frequencies, masks intra- and inter-sample heterogeneity, leading to confounding, reduced power, and bias. Ultimately, unaccounted-for substructure limits summary data usability, especially for understudied or admixed populations. There is a need for methods to enable the harmonization of summary data where the underlying substructure is matched between datasets. Here, we present Summix2, a comprehensive set of methods and software based on a computationally efficient mixture model to enable the harmonization of genetic summary data by estimating and adjusting for substructure. In extensive simulations and application to public data, we show that Summix2 characterizes finer-scale population structure, identifies ascertainment bias, and scans for potential regions of selection due to local substructure deviation. Summix2 increases the robust use of diverse, publicly available summary data, resulting in improved and more equitable research.
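
The core idea in the abstract, modeling a sample's allele frequencies as a mixture of reference-group frequencies and estimating the mixing proportions, can be illustrated with a small sketch. The code below is not the Summix2 software or its interface. It is a minimal illustration that assumes a plain least-squares objective with proportions constrained to be non-negative and sum to one, solved by projected gradient descent. The function names, the simulated reference data, and the solver choice are all hypothetical.

```python
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {p : p >= 0, sum(p) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # the last j that satisfies the condition wins
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(observed_af, reference_af, n_iter=500):
    """Estimate mixture proportions by constrained least squares.

    observed_af:  n allele frequencies for the target sample.
    reference_af: n rows, each holding K reference-group frequencies.
    Hypothetical helper for illustration; not the Summix2 API.
    """
    k = len(reference_af[0])
    # Crude upper bound on the gradient's Lipschitz constant:
    # 2 * trace(R^T R) >= 2 * largest eigenvalue of R^T R.
    lipschitz = 2.0 * sum(x * x for row in reference_af for x in row)
    step = 1.0 / lipschitz
    pi = [1.0 / k] * k  # start from equal proportions
    for _ in range(n_iter):
        grad = [0.0] * k
        for obs, row in zip(observed_af, reference_af):
            resid = sum(p * r for p, r in zip(pi, row)) - obs
            for j in range(k):
                grad[j] += 2.0 * resid * row[j]
        pi = project_to_simplex([p - step * g for p, g in zip(pi, grad)])
    return pi


if __name__ == "__main__":
    random.seed(0)
    # Simulated reference frequencies for 3 groups at 200 variants (assumption).
    ref = [[random.random() for _ in range(3)] for _ in range(200)]
    truth = [0.6, 0.3, 0.1]
    obs = [sum(p * r for p, r in zip(truth, row)) for row in ref]
    print([round(p, 3) for p in estimate_proportions(obs, ref)])
```

In this noise-free simulation the estimate should recover the simulated proportions closely. With real summary data, sampling noise and imperfect reference panels would make the fit approximate, which is the kind of problem the published method addresses.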

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcfc/11866976/5c834b6272c2/fx1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcfc/11866976/6d6132a5f3db/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcfc/11866976/c5fe0c7b95e2/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcfc/11866976/d8f170e8167e/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcfc/11866976/576dc5c8dd1e/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcfc/11866976/21f3c276821f/gr5.jpg

Similar articles

1
Characterizing substructure via mixture modeling in large-scale genetic summary statistics.
Am J Hum Genet. 2025 Feb 6;112(2):235-253. doi: 10.1016/j.ajhg.2024.12.007. Epub 2025 Jan 16.
2
Characterizing substructure via mixture modeling in large-scale genetic summary statistics.
bioRxiv. 2024 May 13:2024.01.29.577805. doi: 10.1101/2024.01.29.577805.
3
Genome-wide Association Identifies Novel Etiological Insights Associated with Parkinson's Disease in African and African Admixed Populations.
medRxiv. 2023 May 7:2023.05.05.23289529. doi: 10.1101/2023.05.05.23289529.
4
Summix: A method for detecting and adjusting for population structure in genetic summary data.
Am J Hum Genet. 2021 Jul 1;108(7):1270-1282. doi: 10.1016/j.ajhg.2021.05.016. Epub 2021 Jun 21.
5
Properties of global- and local-ancestry adjustments in genetic association tests in admixed populations.
Genet Epidemiol. 2018 Mar;42(2):214-229. doi: 10.1002/gepi.22103. Epub 2017 Dec 30.
6
Fine-mapping in admixed populations using CARMA-X, with applications to Latin American studies.
Am J Hum Genet. 2025 May 1;112(5):1215-1232. doi: 10.1016/j.ajhg.2025.02.020. Epub 2025 Mar 26.
7
Identification of genetic risk loci and causal insights associated with Parkinson's disease in African and African admixed populations: a genome-wide association study.
Lancet Neurol. 2023 Nov;22(11):1015-1025. doi: 10.1016/S1474-4422(23)00283-1. Epub 2023 Aug 23.
8
Estimating local ancestry in admixed populations.
Am J Hum Genet. 2008 Feb;82(2):290-303. doi: 10.1016/j.ajhg.2007.09.022.
9
Enhanced statistical tests for GWAS in admixed populations: assessment using African Americans from CARe and a Breast Cancer Consortium.
PLoS Genet. 2011 Apr;7(4):e1001371. doi: 10.1371/journal.pgen.1001371. Epub 2011 Apr 21.
10
Estimating SNP heritability in presence of population substructure in biobank-scale datasets.
Genetics. 2022 Apr 4;220(4). doi: 10.1093/genetics/iyac015.

Cited by

1
Global multi-ancestry genetic study elucidates genes and biological pathways associated with thyroid cancer and benign thyroid diseases.
medRxiv. 2025 May 16:2025.05.15.25327513. doi: 10.1101/2025.05.15.25327513.
2
CCAFE: Estimating Case and Control Allele Frequencies from GWAS Summary Statistics.
bioRxiv. 2024 Oct 29:2024.10.24.619530. doi: 10.1101/2024.10.24.619530.
