Zhong Xue, Jia Gengjie, Yin Zhijun, Cheng Kerou, Rzhetsky Andrey, Li Bingshan, Cox Nancy J
Department of Medicine, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN.
Department of Medicine, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL.
medRxiv. 2025 Mar 24:2025.03.22.25324197. doi: 10.1101/2025.03.22.25324197.
Several health conditions are known to increase the risk of Alzheimer's disease (AD). We aim to systematically identify medical conditions that are associated with subsequent development of AD by leveraging the growing resources of electronic health records (EHRs).
This retrospective cohort study used de-identified EHRs from two independent databases (MarketScan and VUMC), together covering 153 million individuals, to identify AD cases and age- and gender-matched controls. By tracking their EHRs over a 10-year window before AD diagnosis and comparing the records of AD cases with those of controls, we identified medical conditions that are more likely to occur in those who later develop AD. We further assessed the genetic underpinnings of these conditions in relation to AD genetics using data from two large-scale biobanks (BioVU and UK Biobank, total N=450,000).
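As an illustration of the case-versus-control comparison described above, the following is a minimal sketch of how enrichment of a single phenotype could be quantified with an odds ratio and a Wald (Woolf) 95% confidence interval. The abstract does not state which statistical test the authors used, so the method, the function name `odds_ratio_with_ci`, and the counts in the usage lines are all assumptions made for illustration only; an analysis of matched data would more typically use a model that accounts for the matching.

```python
import math


def odds_ratio_with_ci(exposed_cases, total_cases,
                       exposed_controls, total_controls, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    Hypothetical helper for illustration; it ignores the matching
    structure and is not the method reported in the study.
    """
    # 2x2 table: rows are cases/controls, columns are exposed/unexposed.
    a = exposed_cases
    b = total_cases - exposed_cases
    c = exposed_controls
    d = total_controls - exposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells of the 2x2 table must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_, lo, hi = odds_ratio_with_ci(200, 1000, 1000, 10000)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

A phenotype would be called enriched among future cases when the interval lies above 1; screening hundreds of phenotypes at once would additionally require a multiple-testing correction.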
We identified 43,508 AD cases and 419,455 matched controls in MarketScan, and 1,320 AD cases and 12,720 matched controls in VUMC. We detected 406 and 102 medical phenotypes significantly enriched among future AD cases in the MarketScan and VUMC databases, respectively. In both EHR databases, mental disorders and neurological disorders emerged as the two most enriched clinical categories. More than 70 medical phenotypes replicated in both EHR databases; these are dominated by mental disorders (e.g., depression), neurological disorders (e.g., sleep disorders), circulatory system disorders (e.g., cerebral atherosclerosis), and endocrine/metabolic disorders (e.g., type 2 diabetes). We identified 19 phenotypes that are associated with either individual AD risk variants or a polygenic risk score for AD.
In this study, analysis of longitudinal EHRs from independent large-scale databases enabled robust identification of health conditions associated with subsequent development of AD, highlighting potential opportunities for therapeutics and interventions to reduce AD risk.