Hansen Lasse, Bernstorff Martin, Enevoldsen Kenneth, Kolding Sara, Damgaard Jakob Grøhn, Perfalk Erik, Nielbo Kristoffer Laigaard, Danielsen Andreas Aalkjær, Østergaard Søren Dinesen
Department of Affective Disorders, Aarhus University Hospital-Psychiatry, Aarhus, Denmark.
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark.
JAMA Psychiatry. 2025 May 1;82(5):459-469. doi: 10.1001/jamapsychiatry.2024.4702.
IMPORTANCE: The diagnosis of schizophrenia and bipolar disorder is often delayed by several years, even though illness typically emerges in late adolescence or early adulthood. This delay impedes initiation of targeted treatment.
OBJECTIVE: To investigate whether machine learning models trained on routine clinical data from electronic health records (EHRs) can predict diagnostic progression to schizophrenia or bipolar disorder among patients undergoing treatment in psychiatric services for other mental illness.
DESIGN, SETTING, AND PARTICIPANTS: This cohort study was based on data from EHRs from the Psychiatric Services of the Central Denmark Region. All patients aged 15 to 60 years with at least 2 contacts (at least 3 months apart) with the Psychiatric Services of the Central Denmark Region between January 1, 2013, and November 21, 2016, were included. Analysis occurred from December 2022 to November 2024.
EXPOSURES: Predictors based on EHR data, including medications, diagnoses, and clinical notes.
MAIN OUTCOMES AND MEASURES: Diagnostic transition to schizophrenia or bipolar disorder within 5 years, predicted 1 day before outpatient contacts by means of elastic net regularized logistic regression and extreme gradient boosting (XGBoost) models. The area under the receiver operating characteristic curve (AUROC) was used to determine the best performing model.
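As a rough illustration of AUROC-based model selection, the sketch below computes AUROC as the probability that a randomly chosen positive contact receives a higher score than a randomly chosen negative one, and then picks the higher-scoring model. It is not the authors' pipeline: the function names, the model labels, and the toy labels and scores are all hypothetical and made up for the example.

```python
def auroc(labels, scores):
    """AUROC as P(score of a random positive > score of a random negative).

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pick_best_model(labels, scores_by_model):
    """Return the name of the model with the highest AUROC, plus all AUROCs."""
    aurocs = {name: auroc(labels, s) for name, s in scores_by_model.items()}
    best = max(aurocs, key=aurocs.get)
    return best, aurocs


# Toy data (hypothetical): 1 = transition within the follow-up window.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = {
    "model_a": [0.10, 0.40, 0.35, 0.80, 0.20, 0.70],
    "model_b": [0.50, 0.40, 0.30, 0.60, 0.20, 0.10],
}
best_name, toy_aurocs = pick_best_model(toy_labels, toy_scores)
print(best_name, toy_aurocs)
```

The pairwise loop is quadratic in the number of contacts, so it only suits small examples; a rank-based implementation would be needed at the scale reported in the study.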
RESULTS: The study included 24 449 patients (median [Q1-Q3] age at time of prediction, 32.2 [24.2-42.5] years; 13 843 female [56.6%]) and 398 922 outpatient contacts. Transition to the first occurrence of either schizophrenia or bipolar disorder was predicted by the XGBoost model, with an AUROC of 0.70 (95% CI, 0.70-0.70) on the training set and 0.64 (95% CI, 0.63-0.65) on the test set, which consisted of 2 held-out hospital sites. At a predicted positive rate of 4%, the XGBoost model had a sensitivity of 9.3%, a specificity of 96.3%, and a positive predictive value (PPV) of 13.0%. Predicting schizophrenia separately yielded better performance (AUROC, 0.80; 95% CI, 0.79-0.81; sensitivity, 19.4%; specificity, 96.3%; PPV, 10.8%) than was the case for bipolar disorder (AUROC, 0.62; 95% CI, 0.61-0.63; sensitivity, 9.9%; specificity, 96.2%; PPV, 8.4%). Clinical notes proved particularly informative for prediction.
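To make the threshold-dependent metrics concrete, the following sketch shows one way sensitivity, specificity, and PPV could be derived at a fixed predicted positive rate: flag the highest-scoring fraction of contacts as positive and tabulate the confusion matrix. This is an assumption about the general technique, not the study's code; the function name and the toy data are hypothetical.

```python
def metrics_at_positive_rate(labels, scores, positive_rate):
    """Flag the top `positive_rate` fraction of scores and compute metrics.

    labels: 0/1 outcomes; scores: model outputs (higher = more likely).
    """
    n = len(labels)
    n_pos = sum(labels)
    if n_pos == 0 or n_pos == n:
        raise ValueError("need both positive and negative labels")
    # Number of contacts flagged as positive (at least one).
    k = max(1, round(positive_rate * n))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = order[:k]
    tp = sum(1 for i in flagged if labels[i] == 1)
    fp = k - tp
    fn = n_pos - tp
    tn = n - k - fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Toy data (hypothetical): 100 contacts with distinct scores, 5 true positives,
# 2 of which fall among the 4 highest-scoring contacts.
toy_scores = [i / 100 for i in range(100)]
toy_labels = [0] * 100
for idx in (99, 98, 50, 40, 10):
    toy_labels[idx] = 1

toy_metrics = metrics_at_positive_rate(toy_labels, toy_scores, 0.04)
print(toy_metrics)
```

Fixing the predicted positive rate, rather than a score cutoff, keeps the share of flagged contacts constant across models, which makes the resulting sensitivity and PPV directly comparable.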
CONCLUSIONS AND RELEVANCE: These findings suggest that it is possible to predict diagnostic transition to schizophrenia and bipolar disorder from routine clinical data extracted from EHRs, with schizophrenia being notably easier to predict than bipolar disorder.