Deep integrative models for large-scale human genomics.

Affiliations

Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark.

The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Publication information

Nucleic Acids Res. 2023 Jul 7;51(12):e67. doi: 10.1093/nar/gkad373.

DOI: 10.1093/nar/gkad373
PMID: 37224538
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10325897/
Abstract

Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited in data modalities they can use. We developed a deep learning framework (EIR) for PRS prediction which includes a model, genome-local-net (GLN), specifically designed for large-scale genomics data. The framework supports multi-task learning, automatic integration of other clinical and biochemical data, and model explainability. When applied to individual-level data from the UK Biobank, the GLN model demonstrated a competitive performance compared to established neural network architectures, particularly for certain traits, showcasing its potential in modeling complex genetic relationships. Furthermore, the GLN model outperformed linear PRS methods for Type 1 Diabetes, likely due to modeling non-additive genetic effects and epistasis. This was supported by our identification of widespread non-additive genetic effects and epistasis in the context of T1D. Finally, we constructed PRS models that integrated genotype, blood, urine, and anthropometric data and found that this improved performance for 93% of the 290 diseases and disorders considered. EIR is available at https://github.com/arnor-sigurdsson/EIR.
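The contrast the abstract draws between additive linear PRS predictors and models that capture non-additive effects can be illustrated with a toy calculation. The sketch below is not EIR's API and not the GLN architecture; the function names, dosages, and weights are hypothetical values chosen only for illustration. It shows a purely additive score (a weighted sum of allele dosages) next to the same score with one assumed pairwise interaction term of the kind a linear PRS cannot represent.

```python
# Toy illustration only: hypothetical names and numbers, not the EIR/GLN implementation.

def additive_prs(dosages, weights):
    """Linear PRS: sum over variants of effect size times allele dosage (0, 1, or 2)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


def prs_with_interactions(dosages, weights, pair_weights):
    """Additive PRS plus assumed pairwise (epistatic) terms w_ij * d_i * d_j."""
    score = additive_prs(dosages, weights)
    for (i, j), w in pair_weights.items():
        score += w * dosages[i] * dosages[j]
    return score


# One individual with three variants (hypothetical dosages and effect sizes).
dosages = [0, 1, 2]
weights = [0.5, -0.25, 0.125]

additive_score = additive_prs(dosages, weights)
# Same individual, with one assumed interaction between variants 1 and 2.
interaction_score = prs_with_interactions(dosages, weights, {(1, 2): 0.5})

print(additive_score, interaction_score)
```

In this toy case the additive terms cancel, so the two scores differ only by the interaction term. A linear predictor has no parameter for such a term, whereas a non-linear model can in principle learn it from individual-level data.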

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/161c/10325897/0b89ce248146/gkad373fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/161c/10325897/c19de417ece4/gkad373figgra1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/161c/10325897/d38401c38c0a/gkad373fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/161c/10325897/a128676e03dd/gkad373fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/161c/10325897/2dfaac0c4b6e/gkad373fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/161c/10325897/117585ad1ba5/gkad373fig4.jpg
