Merritt Victoria C, Zhang Rui, Sherva Richard, Ly Monica T, Marra David, Panizzon Matthew S, Tsuang Debby W, Hauger Richard L, Logue Mark W
Research Service, VA San Diego Healthcare System, San Diego, USA.
Department of Psychiatry, School of Medicine, University of California, San Diego, La Jolla, USA.
J Alzheimers Dis. 2025 Jan;103(1):180-193. doi: 10.1177/13872877241299130. Epub 2024 Dec 18.
The age distribution and diversity of the VA Million Veteran Program (MVP) cohort make it a valuable resource for studying the genetics of Alzheimer's disease (AD) and related dementias (ADRD).
We present and evaluate the performance of several International Classification of Diseases (ICD) code-based classification algorithms for AD, ADRD, and dementia for use in MVP genetic studies and other studies using VA electronic medical record (EMR) data. These were benchmarked relative to existing ICD algorithms and AD-medication-identified cases.
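As a rough illustration of what an ICD code-based classification rule can look like, the following minimal sketch flags a participant as a case when their record holds at least a threshold number of qualifying codes. The prefix list and the threshold of two codes are assumptions made for this example only; they are not the published MVP-ADRD or Phe-Dementia definitions.

```python
# Hypothetical ICD-9/ICD-10 prefixes used only for illustration;
# these are NOT the code lists used by the algorithms in the study.
DEMENTIA_PREFIXES = ("G30", "F01", "F03", "331.0", "290")

# Assumed minimum number of qualifying codes needed to call a case.
MIN_CODES = 2


def count_qualifying(codes):
    """Count the codes in one record that start with a qualifying prefix."""
    return sum(1 for code in codes if code.startswith(DEMENTIA_PREFIXES))


def classify(records, min_codes=MIN_CODES):
    """Map each participant ID to True (case) or False (non-case).

    records: dict of participant ID -> list of ICD code strings.
    """
    return {
        pid: count_qualifying(codes) >= min_codes
        for pid, codes in records.items()
    }


if __name__ == "__main__":
    toy_records = {
        "p1": ["G30.9", "F03.90", "I10"],  # two qualifying codes
        "p2": ["G30.9"],                   # one qualifying code
        "p3": ["E11.9"],                   # no qualifying codes
    }
    print(classify(toy_records))
```

A real algorithm would typically also consider code dates, encounter type, and age at first code, none of which are modeled here.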
We used chart review of n = 103 MVP participants to evaluate the diagnostic utility of the algorithms. Suitability for genetic studies was examined by assessing association with APOE ε4, the strongest genetic risk factor for AD, in a large MVP cohort (n = 286K).
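Chart-review validation of this kind is commonly summarized as sensitivity and specificity, treating the chart-review diagnosis as the reference standard. The sketch below shows that calculation; the function name and the toy labels are illustrative assumptions, not data from the study.

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) of predicted labels against truth.

    predicted: iterable of booleans, True if the algorithm called a case.
    truth: iterable of booleans, True if chart review confirmed a case.
    """
    tp = fn = tn = fp = 0
    for pred, actual in zip(predicted, truth):
        if actual and pred:
            tp += 1  # true case, flagged by the algorithm
        elif actual and not pred:
            fn += 1  # true case, missed by the algorithm
        elif not actual and not pred:
            tn += 1  # non-case, correctly not flagged
        else:
            fp += 1  # non-case, incorrectly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true case and one true non-case")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy labels for eight hypothetical participants.
    algorithm_calls = [True, True, True, False, True, False, False, False]
    chart_review = [True, True, True, True, False, False, False, False]
    sens, spec = sensitivity_specificity(algorithm_calls, chart_review)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```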
The newly developed MVP-ADRD algorithm performed well, comparable to the existing PheCode dementia algorithm (Phe-Dementia) in terms of sensitivity (0.95 and 0.95, respectively) and specificity (0.65 and 0.70, respectively). The strongest APOE ε4 associations were observed in cases identified using MVP-ADRD and Phe-Dementia augmented with medication-identified cases (MVP-ADRD + medication, p = 3.6 ×10; Phe-Dementia + medication, p = 1.4 ×10). Performance improved when cases were restricted to those with onset age ≥60.
Our MVP-developed ICD-based algorithms performed well in chart review and generated strong genetic signals, especially after inclusion of medication-identified cases. These algorithms are likely to perform well in the broader VA, and they may also be suitable for use in other large-scale EMR-based biobanks in the absence of definitive biomarkers such as amyloid-PET and cerebrospinal fluid biomarkers.