Disease prediction with multi-omics and biomarkers empowers case-control genetic discoveries in the UK Biobank.

Affiliations

Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK.

Department of Haematology, University of Cambridge, Cambridge, UK.

Publication information

Nat Genet. 2024 Sep;56(9):1821-1831. doi: 10.1038/s41588-024-01898-1. Epub 2024 Sep 11.

DOI:10.1038/s41588-024-01898-1
PMID:39261665
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11390475/
Abstract

The emergence of biobank-level datasets offers new opportunities to discover novel biomarkers and develop predictive algorithms for human disease. Here, we present an ensemble machine-learning framework (machine learning with phenotype associations, MILTON) utilizing a range of biomarkers to predict 3,213 diseases in the UK Biobank. Leveraging the UK Biobank's longitudinal health record data, MILTON predicts incident disease cases undiagnosed at time of recruitment, largely outperforming available polygenic risk scores. We further demonstrate the utility of MILTON in augmenting genetic association analyses in a phenome-wide association study of 484,230 genome-sequenced samples, along with 46,327 samples with matched plasma proteomics data. This resulted in improved signals for 88 known (P < 1 × 10) gene-disease relationships alongside 182 gene-disease relationships that did not achieve genome-wide significance in the nonaugmented baseline cohorts. We validated these discoveries in the FinnGen biobank alongside two orthogonal machine-learning methods built for gene-disease prioritization. All extracted gene-disease associations and incident disease predictive biomarkers are publicly available ( http://milton.public.cgr.astrazeneca.com ).
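
The cohort-augmentation idea in the abstract can be illustrated with a minimal sketch. This is not MILTON's implementation: the `Participant` record, the `risk_score` field, the `THRESHOLD` cut-off, the toy counts, and the choice of a one-sided Fisher's exact test are all assumptions made here for illustration. The sketch only shows how adding predicted-but-undiagnosed participants to the case group changes the carrier contingency table that a case-control association test sees.

```python
from dataclasses import dataclass
from math import comb
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Participant:
    diagnosed: bool    # diagnosis present in the health record
    risk_score: float  # hypothetical biomarker-model output in [0, 1]
    carrier: bool      # carries a qualifying variant in the gene of interest


THRESHOLD = 0.7  # hypothetical cut-off for calling a predicted case


def is_baseline_case(p: Participant) -> bool:
    """Baseline cohort: only diagnosed participants count as cases."""
    return p.diagnosed


def is_augmented_case(p: Participant) -> bool:
    """Augmented cohort: diagnosed participants plus high-scoring undiagnosed ones."""
    return p.diagnosed or p.risk_score >= THRESHOLD


def contingency(
    participants: Iterable[Participant],
    is_case: Callable[[Participant], bool],
) -> Tuple[int, int, int, int]:
    """Return (case carriers, case non-carriers, control carriers, control non-carriers)."""
    a = b = c = d = 0
    for p in participants:
        if is_case(p):
            if p.carrier:
                a += 1
            else:
                b += 1
        elif p.carrier:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_enrichment_p(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test: P(case carriers >= a) under the hypergeometric null."""
    n_cases, n_carriers, n_total = a + b, a + c, a + b + c + d
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_cases - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_cases)


def toy_cohort() -> List[Participant]:
    """Made-up cohort of 100 participants; the counts are illustrative, not from the paper."""
    groups = [
        (3, True, 0.90, True),     # diagnosed carriers
        (7, True, 0.80, False),    # diagnosed non-carriers
        (3, False, 0.85, True),    # undiagnosed, high score, carriers
        (2, False, 0.85, False),   # undiagnosed, high score, non-carriers
        (2, False, 0.10, True),    # undiagnosed, low score, carriers
        (83, False, 0.10, False),  # undiagnosed, low score, non-carriers
    ]
    return [
        Participant(diagnosed, score, carrier)
        for count, diagnosed, score, carrier in groups
        for _ in range(count)
    ]


if __name__ == "__main__":
    cohort = toy_cohort()
    for label, rule in (("baseline", is_baseline_case), ("augmented", is_augmented_case)):
        table = contingency(cohort, rule)
        print(label, table, fisher_enrichment_p(*table))
```

In this toy data the augmented case group contains more carriers, so the enrichment p-value is smaller than in the baseline cohort. In real data the effect depends on how accurate the predicted cases are, since misclassified controls would dilute the signal instead.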

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4399/11390475/f665e32ec63d/41588_2024_1898_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4399/11390475/50135f3bb0cd/41588_2024_1898_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4399/11390475/85a0f73fafb7/41588_2024_1898_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4399/11390475/ccd0f5169dd2/41588_2024_1898_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4399/11390475/371ec3b70069/41588_2024_1898_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4399/11390475/f2d4b8e3bceb/41588_2024_1898_Fig6_HTML.jpg

Similar articles

1
Disease prediction with multi-omics and biomarkers empowers case-control genetic discoveries in the UK Biobank.
Nat Genet. 2024 Sep;56(9):1821-1831. doi: 10.1038/s41588-024-01898-1. Epub 2024 Sep 11.
2
Improving prediction models of amyotrophic lateral sclerosis (ALS) using polygenic, pre-existing conditions, and survey-based risk scores in the UK Biobank.
J Neurol. 2024 Oct;271(10):6923-6934. doi: 10.1007/s00415-024-12644-2. Epub 2024 Sep 9.
3
A phenome-wide association study of polygenic scores for selected childhood cancer: Results from the UK Biobank.
HGG Adv. 2025 Jan 9;6(1):100356. doi: 10.1016/j.xhgg.2024.100356. Epub 2024 Sep 26.
4
Optimizing UK biobank cloud-based research analysis platform to fine-map coronary artery disease loci in whole genome sequencing data.
Sci Rep. 2025 Mar 25;15(1):10335. doi: 10.1038/s41598-025-95286-2.
5
Trans-biobank analysis with 676,000 individuals elucidates the association of polygenic risk scores of complex traits with human lifespan.
Nat Med. 2020 Apr;26(4):542-548. doi: 10.1038/s41591-020-0785-8. Epub 2020 Mar 23.
6
Genetic and Phenotypic Features of Schizophrenia in the UK Biobank.
JAMA Psychiatry. 2024 Jul 1;81(7):681-690. doi: 10.1001/jamapsychiatry.2024.0200.
7
Exploring various polygenic risk scores for skin cancer in the phenomes of the Michigan genomics initiative and the UK Biobank with a visual catalog: PRSWeb.
PLoS Genet. 2019 Jun 13;15(6):e1008202. doi: 10.1371/journal.pgen.1008202. eCollection 2019 Jun.
8
Analysis of common genetic variation and rare CNVs in the Australian Autism Biobank.
Mol Autism. 2021 Feb 10;12(1):12. doi: 10.1186/s13229-020-00407-5.
9
Development of a Polygenic Risk Score for Metabolic Dysfunction-Associated Steatotic Liver Disease Prediction in UK Biobank.
Genes (Basel). 2024 Dec 28;16(1):33. doi: 10.3390/genes16010033.
10
Evaluation of polygenic scoring methods in five biobanks shows larger variation between biobanks than methods and finds benefits of ensemble learning.
Am J Hum Genet. 2024 Jul 11;111(7):1431-1447. doi: 10.1016/j.ajhg.2024.06.003. Epub 2024 Jun 21.

Cited by

1
Learning the natural history of human disease with generative transformers.
Nature. 2025 Sep 17. doi: 10.1038/s41586-025-09529-3.
2
Creating an Interactive Web Interface for Networks Stored in Knowledge Graph Databases.
Curr Protoc. 2025 Sep;5(9):e70200. doi: 10.1002/cpz1.70200.
3
Applications and challenges of biomarker-based predictive models in proactive health management.
Front Public Health. 2025 Aug 18;13:1633487. doi: 10.3389/fpubh.2025.1633487. eCollection 2025.
4
DeepAnnotation: A novel interpretable deep learning-based genomic selection model that integrates comprehensive functional annotations.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf083.
5
Unlocking mental health insights with UK Biobank data: Past use and future opportunities.
Psychol Med. 2025 Aug 26;55:e244. doi: 10.1017/S0033291725101359.
6
Leveraging multiomic approaches to elucidate mechanisms of heterogeneity in Alzheimer's disease: Neuropsychiatric symptoms, co-pathologies, and sex differences.
Alzheimers Dement. 2025 Aug;21(8):e70549. doi: 10.1002/alz.70549.
7
Multi-omics integration predicts 17 disease incidences in the UK Biobank.
medRxiv. 2025 Aug 5:2025.08.01.25332841. doi: 10.1101/2025.08.01.25332841.
8
The $10 proteome: low-cost, deep and quantitative proteome profiling of limited sample amounts using the Orbitrap Astral and timsTOF Ultra 2 mass spectrometers.
bioRxiv. 2025 Jul 31:2025.07.29.667408. doi: 10.1101/2025.07.29.667408.
9
Large-scale evaluation of proteomic and polygenic risk scores reveals complementary contributions to incident disease prediction.
medRxiv. 2025 Jul 11:2025.07.10.25331242. doi: 10.1101/2025.07.10.25331242.
10
Potential value streams of an integrated Canadian serosurveillance network.
Can J Public Health. 2025 Jun 30. doi: 10.17269/s41997-025-01075-9.

References

1
Phenome-wide identification of therapeutic genetic targets, leveraging knowledge graphs, graph neural networks, and UK Biobank data.
Sci Adv. 2024 May 10;10(19):eadj1424. doi: 10.1126/sciadv.adj1424. Epub 2024 May 8.
2
Plasma proteomic profiles predict future dementia in healthy adults.
Nat Aging. 2024 Feb;4(2):247-260. doi: 10.1038/s43587-023-00565-0. Epub 2024 Feb 12.
3
Neurofilament light protein as a biomarker for spinal muscular atrophy: a review and reference ranges.
Clin Chem Lab Med. 2024 Jan 15;62(7):1252-1265. doi: 10.1515/cclm-2023-1311. Print 2024 Jun 25.
4
Deep learning-based phenotype imputation on population-scale biobank data increases genetic discoveries.
Nat Genet. 2023 Dec;55(12):2269-2276. doi: 10.1038/s41588-023-01558-w. Epub 2023 Nov 20.
5
Plasma proteomic associations with genetics and health in the UK Biobank.
Nature. 2023 Oct;622(7982):329-338. doi: 10.1038/s41586-023-06592-6. Epub 2023 Oct 4.
6
Rare variant associations with plasma protein levels in the UK Biobank.
Nature. 2023 Oct;622(7982):339-347. doi: 10.1038/s41586-023-06547-x. Epub 2023 Oct 4.
7
Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes.
Cell Genom. 2022 Aug 15;2(9):100168. doi: 10.1016/j.xgen.2022.100168. eCollection 2022 Sep 14.
8
FinnGen provides genetic insights from a well-phenotyped isolated population.
Nature. 2023 Jan;613(7944):508-518. doi: 10.1038/s41586-022-05473-8. Epub 2023 Jan 18.
9
Cancer-driving mutations are enriched in genic regions intolerant to germline variation.
Sci Adv. 2022 Aug 26;8(34):eabo6371. doi: 10.1126/sciadv.abo6371.
10
Breast Cancer; Discovery of Novel Diagnostic Biomarkers, Drug Resistance, and Therapeutic Implications.
Front Mol Biosci. 2022 Feb 21;9:783450. doi: 10.3389/fmolb.2022.783450. eCollection 2022.