Xu Ying, Western Daniel, Heo Gyujin, Nho Kwangsik, Huang Yen-Ning, Liu Shiwei, Oh Hamilton Se-Hwee, Chen Yike, Timsina Jigyasha, Liu Menghan, Tang Yinxu, Gong Katherine, Budde John, Krish Varsha, Imam Farhad, Fuentes Raquel Puerta, Cano Amanda, Marquie Marta, Boada Merce, Pastor Pau, Ruiz Agustin, Fernández Maria Victoria, Bennett David, Wyss-Coray Tony, Saykin Andrew J, Ali Muhammad, Cruchaga Carlos
Department of Psychiatry, Washington University School of Medicine, St. Louis, MO 63110.
NeuroGenomics and Informatics Center, Washington University School of Medicine, St. Louis, MO 63110.
medRxiv. 2025 Jul 10:2025.07.09.25331192. doi: 10.1101/2025.07.09.25331192.
Neurodegenerative diseases (NDs), including Alzheimer's disease (AD), Parkinson's disease (PD), frontotemporal dementia, and dementia with Lewy bodies, pose diagnostic challenges due to overlapping pathology and clinical heterogeneity. We leveraged proteomic data from more than 21,000 cerebrospinal fluid (CSF) and plasma samples to develop and validate explainable, boosting-based multi-disease AI classifiers. In the testing datasets, the models achieved weighted AUCs of 0.97 for CSF and 0.88 for plasma, performance equivalent to that of traditional biomarkers. The model was validated with neuropathological and clinical data, confirming robust generalizability without any retraining. Using zero-shot learning, we classified disease subtypes, including autosomal dominant AD and prodromal PD, and clarified disease states for individuals with conflicting clinical information. The model was also able to prioritize cognitively normal individuals at risk of disease. This framework yields continuous, individual-level disease probabilities that quantify the overlap across diseases and co-pathologies within an individual. Through this work, we establish a benchmark computational framework for enhancing diagnostic precision in NDs.