Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA.
Graduate Program in Bioengineering, University of California, San Francisco and University of California, Berkeley, San Francisco and Berkeley, CA, USA.
Nat Aging. 2024 Mar;4(3):379-395. doi: 10.1038/s43587-024-00573-8. Epub 2024 Feb 21.
Identification of Alzheimer's disease (AD) onset risk can facilitate interventions before irreversible disease progression. We demonstrate that electronic health records from the University of California, San Francisco, followed by knowledge networks (for example, SPOKE), allow for (1) prediction of AD onset, (2) prioritization of biological hypotheses and (3) contextualization of sex dimorphism. We trained random forest models and predicted AD onset on a cohort of 749 individuals with AD and 250,545 controls, with a mean area under the receiver operating characteristic curve (AUROC) of 0.72 (7 years prior) to 0.81 (1 day prior). We further harnessed matched cohort models to identify conditions with predictive power before AD onset. Knowledge networks highlight shared genes between multiple top predictors and AD (for example, APOE, ACTB, IL6 and INS). Genetic colocalization analysis supports AD association with hyperlipidemia at the APOE locus, as well as a stronger female AD association with osteoporosis at a locus near MS4A6A. We therefore show how clinical data can be utilized for early AD prediction and identification of personalized biological hypotheses.
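The performance figures above are reported as a mean area under the receiver operating characteristic curve. As a reading aid, the following is a minimal sketch of how that metric can be computed from predicted risk scores. It is not the authors' pipeline: the function names `auroc` and `mean_auroc` are hypothetical, the pairwise formulation is one of several equivalent ways to compute the metric, and the labels and scores are made-up toy values, not study data.

```python
def auroc(labels, scores):
    """Area under the ROC curve, computed as a pairwise win rate.

    This equals the probability that a randomly chosen case (label 1)
    receives a higher score than a randomly chosen control (label 0),
    with ties counted as one half (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc(runs):
    """Average AUROC over several (labels, scores) evaluation runs."""
    return sum(auroc(y, s) for y, s in runs) / len(runs)


# Toy, made-up example: 2 cases and 4 controls (NOT data from the study).
toy_labels = [1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
print(auroc(toy_labels, toy_scores))  # 7 of 8 case-control pairs ranked correctly -> 0.875
```

The nested loop compares every case with every control, which is easy to verify by hand but slow for a cohort with hundreds of cases and hundreds of thousands of controls; a rank-based implementation would be the practical choice at that scale.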