Wallensten Johanna, Wachtler Caroline, Bogdanovic Nenad, Olofsson Anna, Kivipelto Miia, Jönsson Linus, Petrovic Predrag, Carlsson Axel C
Department of Clinical Sciences, Danderyd Hospital, 18288, Stockholm, Sweden; Academic Primary Health Care Centre, Region Stockholm, Sweden.
Academic Primary Health Care Centre, Region Stockholm, Sweden; Division of Family Medicine and Primary Care, Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, Alfred Nobels allé 23, 14183 Huddinge, Sweden.
J Prev Alzheimers Dis. 2025 May;12(5):100115. doi: 10.1016/j.tjpad.2025.100115. Epub 2025 Mar 8.
Integrating machine learning with medical records offers potential for early detection of Alzheimer's disease (AD), enabling timely interventions.
This study aimed to evaluate how effectively machine learning can construct a predictive model for AD that uses data recorded up to three years before diagnosis. Using clinical data, including prior diagnoses and medical treatments, we sought to enhance sensitivity and specificity in diagnostic procedures. A second aim was to identify the most influential factors in the machine learning models, as these may be important predictors of AD.
The study employed Stochastic Gradient Boosting, a machine learning method, to identify diagnoses predictive of AD using primary healthcare data. The analyses were stratified by sex and age groups.
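As a rough illustration of the technique named above, the following Python sketch implements a toy stochastic gradient boosting classifier. The abstract does not state the software, hyperparameters, or feature encoding used in the study, so everything here is an assumption made for illustration: the function names (`fit_stump`, `fit_sgb`, `predict_proba`), the binary "code present / absent" features, the one-split stump learners, and all settings. The "stochastic" part is the random row subsample drawn in each boosting round.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals, rows):
    """Find the binary feature whose split best fits the residuals on `rows`."""
    best = None
    for j in range(len(X[0])):
        left = [residuals[i] for i in rows if X[i][j] == 0]
        right = [residuals[i] for i in rows if X[i][j] == 1]
        if not left or not right:
            continue  # this feature does not split the subsample
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, j, ml, mr)
    return best  # None if no feature splits the subsample

def fit_sgb(X, y, n_rounds=50, learning_rate=0.1, subsample=0.5, seed=0):
    """Toy stochastic gradient boosting for binary labels (log-loss gradient)."""
    rng = random.Random(seed)
    n = len(y)
    p = sum(y) / n
    f0 = math.log(p / (1 - p))          # initial score: log-odds of the base rate
    scores = [f0] * n
    stumps = []
    for _ in range(n_rounds):
        # Stochastic step: each round sees only a random subset of rows.
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        # Negative gradient of the log loss: observed label minus predicted probability.
        residuals = [y[i] - sigmoid(scores[i]) for i in range(n)]
        stump = fit_stump(X, residuals, rows)
        if stump is None:
            continue
        _, j, ml, mr = stump
        stumps.append((j, ml, mr))
        for i in range(n):
            scores[i] += learning_rate * (mr if X[i][j] else ml)
    return f0, stumps, learning_rate

def predict_proba(model, x):
    f0, stumps, lr = model
    z = f0 + sum(lr * (mr if x[j] else ml) for j, ml, mr in stumps)
    return sigmoid(z)

# Hypothetical toy data: feature 0 agrees with the label in 80% of rows,
# features 1 and 2 carry no designed signal.
X = [[i % 2, (i // 2) % 2, (i // 4) % 2] for i in range(100)]
y = [x[0] if i % 5 else 1 - x[0] for i, x in enumerate(X)]

model = fit_sgb(X, y)
print(predict_proba(model, [1, 0, 0]), predict_proba(model, [0, 0, 0]))
```

In practice a tested library implementation would be preferable to hand-written code; for example, scikit-learn's `GradientBoostingClassifier` becomes stochastic when its `subsample` parameter is set below 1.0. Stratifying by sex and age group, as the study describes, would amount to fitting one such model per stratum.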
The study included individuals within Region Stockholm, Sweden, using medical records from 2010 to 2022.
The study analyzed clinical data for individuals over the age of 40. Patients with an AD diagnosis (ICD-10-SE codes F00 or G30) during 2010-2012 were excluded to ensure prospective modeling. In total, AD was identified in 3,407 patients aged 41-69 years and 25,796 patients aged over 69.
The machine learning model ranked predictive diagnoses, with performance assessed by the area under the receiver operating characteristic curve (AUC). Known and novel predictors were evaluated for their contribution to AD risk.
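For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. The sketch below computes it directly from that definition; the function name `auc` and the example labels and scores are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical example: 3 of the 4 case/non-case pairs are ordered correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a small illustration; rank-based implementations are the usual choice for cohorts of the size reported here.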
AUC values ranged from 0.748 (women aged 41-69) to 0.816 (women over 69), with men across age groups falling within this range. Sensitivity and specificity ranged from 0.73 to 0.79 and 0.66 to 0.79, respectively, across age and sex groups. Negative predictive values were consistently high (≥0.954), while positive predictive values were lower (0.199-0.351). Additionally, we confirmed known risk factors as predictors and identified novel predictors that warrant further investigation. Key predictors included medical observations, cognitive symptoms, antidepressant treatment, visit frequency, and vitamin B12/folic acid treatment.
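The combination of high negative and low positive predictive values is what Bayes' rule produces when the outcome is uncommon in the analyzed population. The sketch below shows the arithmetic. The abstract does not report prevalence, so the 10% figure is purely hypothetical, as is the function name `predictive_values`; the sensitivity and specificity of 0.75 are round numbers chosen from within the reported ranges.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical inputs: prevalence is an assumed value, not a study result.
ppv, npv = predictive_values(sensitivity=0.75, specificity=0.75, prevalence=0.10)
print(round(ppv, 3), round(npv, 3))  # 0.25 0.964
```

Under these assumed inputs, only a quarter of positive predictions are true cases, while a negative prediction is correct about 96% of the time, which is the same qualitative pattern as the reported values.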
Machine learning applied to clinical data shows promise in predicting AD, with robust model performance across age and sex groups. The findings confirmed known risk factors, such as depression and vitamin B12 deficiency, while also identifying novel predictors that may guide future research. Clinically, this approach could enhance early detection and risk stratification, facilitating timely interventions and improving patient outcomes.