Peng Jiajie, Bao Zhijie, Li Jingyi, Han Ruijiang, Wang Yuxian, Han Lu, Peng Jinghao, Wang Tao, Hao Jianye, Wei Zhongyu, Shang Xuequn
AI for Science Interdisciplinary Research Center, School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China.
Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710129, China.
Fundam Res. 2024 Mar 19;4(4):752-760. doi: 10.1016/j.fmre.2024.02.015. eCollection 2024 Jul.
The prospect of identifying individuals at high disease risk from genotype data alone has attracted significant interest. Although widely applied, traditional polygenic risk score (PRS) methods fall short because they are built on additive models that cannot capture the intricate associations among single nucleotide polymorphisms (SNPs). This is a limitation, as genetic diseases often arise from complex interactions between multiple SNPs. To address this challenge, we developed DeepRisk, a biological knowledge-driven deep learning method that models these complex, nonlinear associations among SNPs and scores the risk of common diseases more effectively from genome-wide genotype data. Evaluations demonstrated that DeepRisk outperforms existing PRS-based methods in identifying individuals at high risk for four common diseases: Alzheimer's disease, inflammatory bowel disease, type 2 diabetes, and breast cancer.
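To make the additive-model limitation concrete, the following is a minimal sketch and not the DeepRisk method. It assumes a textbook additive PRS (a weighted sum of risk-allele counts) and a hypothetical two-SNP interaction in which risk is elevated only when both SNPs carry a risk allele. The function names, weights, and the toy risk rule are all illustrative assumptions and are not taken from the paper.

```python
from itertools import product


def additive_prs(genotype, weights):
    """Textbook additive PRS: weighted sum of risk-allele counts (0, 1, or 2 per SNP)."""
    return sum(w * g for w, g in zip(weights, genotype))


def toy_interaction_risk(genotype):
    """Hypothetical epistatic rule: risk is elevated only if both SNPs carry a risk allele."""
    g1, g2 = genotype
    return 1.0 if g1 > 0 and g2 > 0 else 0.0


# Illustrative per-SNP effect sizes (assumed values, not estimates from any study).
weights = [0.3, 0.3]

# Enumerate all two-SNP genotypes and compare the additive score with the toy risk.
for genotype in product([0, 1, 2], repeat=2):
    score = additive_prs(genotype, weights)
    risk = toy_interaction_risk(genotype)
    print(f"genotype={genotype} additive_prs={score:.1f} toy_risk={risk}")
```

In this toy setting, genotypes (2, 0) and (1, 1) receive the same additive score even though only (1, 1) is at elevated risk under the assumed interaction rule. No choice of per-SNP weights can separate them in a purely additive score, which is the kind of pattern a nonlinear model is meant to capture.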