Variational autoencoder-based model improves polygenic prediction in blood cell traits.

Authors

Li Xiaoqi, Kharitonova Elena, Pang Minxing, Wen Jia, Zhou Laura Y, Raffield Laura, Zhou Haibo, Yao Huaxiu, Chen Can, Li Yun, Sun Quan

Affiliations

Carolina Health Informatics Program, University of North Carolina, Chapel Hill, NC, USA.

Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA.

Publication information

HGG Adv. 2025 Aug 8;6(4):100490. doi: 10.1016/j.xhgg.2025.100490.

Abstract

Genetic prediction of complex traits, enabled by large-scale genomic studies, has created new measures for understanding individual genetic predisposition. Polygenic risk scores (PRSs) aggregate information across the genome, enabling personalized risk prediction for complex traits and diseases. However, conventional PRS calculation methods rely on linear models and are therefore limited in their ability to capture complex patterns and interaction effects in high-dimensional genomic data. In this study, we seek to improve the predictive power of PRSs by applying advanced deep learning techniques. We show that a variational autoencoder-based model for PRS construction (VAE-PRS) outperforms current state-of-the-art methods for biobank-level data in 14 of 16 blood cell traits, while remaining computationally efficient. Through comprehensive experiments, we found that the VAE-PRS model can capture interaction effects in high-dimensional data and performs robustly across different pre-screened variant sets. Furthermore, VAE-PRS can be interpreted by assessing the contribution of each individual marker to the final prediction score with the Shapley additive explanations method, which may provide new insights for identifying trait-associated genetic variants. In summary, given an appropriate training sample size, VAE-PRS offers a deep learning approach to genetic risk prediction for blood cell traits that could further facilitate the development of personalized medicine and genetic research.
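
To make the contrast with conventional methods concrete, the following is a minimal sketch of an additive (linear) PRS, in which each variant contributes its effect weight multiplied by the individual's allele dosage. It is not the VAE-PRS model described in the abstract. The function name `linear_prs`, the weights, and the dosages are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def linear_prs(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Additive score: sum over variants of effect weight times allele dosage."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical effect weights for three variants (illustrative values only).
weights = [0.5, -0.25, 1.0]

# Allele dosages (0, 1, or 2 copies of the effect allele) for two individuals.
person_a = [2, 0, 1]
person_b = [0, 2, 1]

score_a = linear_prs(person_a, weights)  # 0.5*2 + (-0.25)*0 + 1.0*1
score_b = linear_prs(person_b, weights)  # 0.5*0 + (-0.25)*2 + 1.0*1

print(score_a, score_b)
```

Because each variant enters the sum independently, a score of this form cannot represent an effect that depends on two variants jointly, which is the kind of interaction the abstract says nonlinear models are intended to capture.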

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02f2/12398231/2a1a520773ac/gr1.jpg
