
Diverse ancestral representation improves genetic intolerance metrics.

Authors

Han Alexander L, Sands Chloe F, Matelska Dorota, Butts Jessica C, Ravanmehr Vida, Hu Fengyuan, Villavicencio Gonzalez Esmeralda, Katsanis Nicholas, Bustamante Carlos D, Wang Quanli, Petrovski Slavé, Vitsios Dimitrios, Dhindsa Ryan S

Affiliations

Department of Pathology and Immunology, Baylor College of Medicine, Houston, TX, USA.

Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, USA.

Publication information

Nat Commun. 2025 Mar 18;16(1):2648. doi: 10.1038/s41467-025-57885-5.

DOI: 10.1038/s41467-025-57885-5
PMID: 40102419
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11920395/
Abstract

The unprecedented scale of genomic databases has revolutionized our ability to identify regions in the human genome that are intolerant to variation, regions often implicated in disease. However, these datasets remain constrained by limited ancestral diversity. Here, we analyze whole-exome sequencing data from 460,551 UK Biobank and 125,748 Genome Aggregation Database (gnomAD) participants across multiple ancestries to test several key intolerance metrics, including the Residual Variance Intolerance Score (RVIS), Missense Tolerance Ratio (MTR), and Loss-of-Function Observed/Expected ratio (LOF O/E). We demonstrate that increasing ancestral representation, rather than sample size alone, critically drives their performance. Scores trained on variation observed in African and Admixed American ancestral groups show higher resolution in detecting haploinsufficient and neurodevelopmental disease risk genes compared to scores trained on European ancestry groups. Most strikingly, MTR trained on 43,000 multi-ancestry exomes demonstrates greater predictive power than when trained on a nearly 10-fold larger dataset of 440,000 non-Finnish European exomes. We further find that European ancestry group-based scores are likely approaching saturation. These findings highlight the need for enhanced population representation in genomic resources to fully realize the potential of precision medicine and drug discovery. Ancestry group-specific scores are publicly available through an interactive portal: http://intolerance.public.cgr.astrazeneca.com/
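
For orientation, the two ratio-based metrics named in the abstract can be sketched as below. This is a minimal illustration based on the commonly published definitions of MTR and LOF O/E, not the authors' pipeline. The function names and all variant counts are hypothetical, and the sliding-window and mutation-rate modelling that real implementations use is omitted.

```python
def missense_tolerance_ratio(obs_missense, obs_synonymous,
                             possible_missense, possible_synonymous):
    """MTR: observed missense proportion divided by the proportion
    expected if every possible coding change were equally tolerated.
    Values well below 1 suggest missense intolerance in the window."""
    observed_total = obs_missense + obs_synonymous
    possible_total = possible_missense + possible_synonymous
    if observed_total == 0 or possible_total == 0 or possible_missense == 0:
        raise ValueError("window has too few variants to compute MTR")
    observed_prop = obs_missense / observed_total
    expected_prop = possible_missense / possible_total
    return observed_prop / expected_prop


def lof_observed_expected(observed_lof, expected_lof):
    """LOF O/E: observed loss-of-function variant count divided by the
    count expected under a neutral mutation model (supplied by the caller)."""
    if expected_lof <= 0:
        raise ValueError("expected LoF count must be positive")
    return observed_lof / expected_lof


# Hypothetical counts for one protein-coding window and one gene.
mtr = missense_tolerance_ratio(5, 15, 70, 30)
oe = lof_observed_expected(2, 20.0)
print(f"MTR = {mtr:.3f}, LOF O/E = {oe:.2f}")
```

Because both metrics depend on which variants have been observed, a cohort with broader ancestral representation changes the observed counts, which is the effect the abstract evaluates.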


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/137b/11920395/f82ec5354959/41467_2025_57885_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/137b/11920395/d5d665f563b1/41467_2025_57885_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/137b/11920395/16ba92e18ee0/41467_2025_57885_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/137b/11920395/ba6294f3c2c0/41467_2025_57885_Fig4_HTML.jpg

Similar articles

1
Rare variant contribution to human disease in 281,104 UK Biobank exomes.
Nature. 2021 Sep;597(7877):527-532. doi: 10.1038/s41586-021-03855-y. Epub 2021 Aug 10.
2
Leveraging diverse genomic data to guide equitable carrier screening: Insights from gnomAD v.4.1.0.
Am J Hum Genet. 2025 Jan 2;112(1):181-195. doi: 10.1016/j.ajhg.2024.11.004. Epub 2024 Nov 29.
3
MTR-Viewer: identifying regions within genes under purifying selection.
Nucleic Acids Res. 2019 Jul 2;47(W1):W121-W126. doi: 10.1093/nar/gkz457.
4
Sub-genic intolerance, ClinVar, and the epilepsies: A whole-exome sequencing study of 29,165 individuals.
Am J Hum Genet. 2021 Jun 3;108(6):965-982. doi: 10.1016/j.ajhg.2021.04.009. Epub 2021 Apr 30.
5
Genetic intolerance analysis as a tool for protein science.
Biochim Biophys Acta Biomembr. 2020 Jan 1;1862(1):183058. doi: 10.1016/j.bbamem.2019.183058. Epub 2019 Sep 5.
6
Exploring the mutational landscape of genes associated with inherited retinal disease using large genomic datasets: identifying loss of function intolerance and outlying propensities for missense changes.
BMJ Open Ophthalmol. 2022 Aug;7(1). doi: 10.1136/bmjophth-2022-001079. Epub 2022 Aug 25.
7
The mutational constraint spectrum quantified from variation in 141,456 humans.
Nature. 2020 May;581(7809):434-443. doi: 10.1038/s41586-020-2308-7. Epub 2020 May 27.
8
Unequal representation of genetic variation across ancestry groups creates healthcare inequality in the application of precision medicine.
Genome Biol. 2016 Jul 14;17(1):157. doi: 10.1186/s13059-016-1016-y.
9
Cross-ancestry genome-wide association studies identified heterogeneous loci associated with differences of allele frequency and regulome tagging between participants of European descent and other ancestry groups from the UK Biobank.
Hum Mol Genet. 2021 Jul 9;30(15):1457-1467. doi: 10.1093/hmg/ddab114.
