Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA.
Division of Rheumatology, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA.
Ann Clin Transl Neurol. 2021 Apr;8(4):800-810. doi: 10.1002/acn3.51324. Epub 2021 Feb 24.
No relapse risk prediction tool is currently available to guide treatment selection for multiple sclerosis (MS). Leveraging electronic health record (EHR) data readily available at the point of care, we developed a clinical tool for predicting MS relapse risk.
Using data from a clinic-based research registry and linked EHR system between 2006 and 2016, we developed models predicting relapse events from the registry in a training set (n = 1435) and tested the model performance in an independent validation set of MS patients (n = 186). This iterative process identified prior 1-year relapse history as a key predictor of future relapse, but ascertaining relapse history through labor-intensive chart review is impractical. We therefore pursued two-stage algorithm development: (1) L1-regularized logistic regression (LASSO) to phenotype past 1-year relapse status from contemporaneous EHR data, and (2) LASSO to predict future 1-year relapse risk using imputed prior 1-year relapse status and other algorithm-selected features.
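The two-stage design can be illustrated with a minimal sketch using scikit-learn. This is not the authors' implementation: the function names (`fit_two_stage`, `predict_risk`, `make_synthetic`), the penalty strength `C`, the feature layout, and the synthetic data are all assumptions made for illustration, and stage 1 is applied in-sample here purely for brevity.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def fit_two_stage(X_ehr, prior_relapse, X_base, future_relapse, C=1.0):
    """Fit stage 1 (phenotyping) and stage 2 (prediction) L1-penalized models."""
    # Stage 1: phenotype prior 1-year relapse status from contemporaneous EHR features.
    stage1 = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    stage1.fit(X_ehr, prior_relapse)
    # Stage 2: predict future relapse from baseline features plus the imputed status.
    imputed = stage1.predict_proba(X_ehr)[:, 1]
    stage2 = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    stage2.fit(np.column_stack([X_base, imputed]), future_relapse)
    return stage1, stage2


def predict_risk(stage1, stage2, X_ehr, X_base):
    """Predict future-relapse probability without chart-reviewed relapse history."""
    imputed = stage1.predict_proba(X_ehr)[:, 1]
    return stage2.predict_proba(np.column_stack([X_base, imputed]))[:, 1]


def make_synthetic(n, seed):
    """Generate hypothetical data; none of these values come from the study."""
    rng = np.random.default_rng(seed)
    prior = rng.binomial(1, 0.3, size=n)      # true prior 1-year relapse status
    X_ehr = rng.normal(size=(n, 5))           # stand-ins for EHR-derived features
    X_ehr[:, 0] += 1.5 * prior                # two features carry relapse signal
    X_ehr[:, 1] += 1.0 * prior
    X_base = rng.normal(size=(n, 2))          # standardized age, disease duration
    logit = -1.0 + 1.2 * prior - 0.5 * X_base[:, 0] - 0.5 * X_base[:, 1]
    future = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    return X_ehr, prior, X_base, future


X_ehr, prior, X_base, future = make_synthetic(500, seed=0)
stage1, stage2 = fit_two_stage(X_ehr, prior, X_base, future)
risk = predict_risk(stage1, stage2, X_ehr, X_base)
```

At prediction time only EHR-derived and baseline features are required, which is what makes the approach usable at the point of care.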
The final model, comprising age, disease duration, and imputed prior 1-year relapse history, achieved a predictive AUC of 0.707 and an F1 score of 0.307. Its performance was significantly better than that of the baseline model (age, sex, race/ethnicity, and disease duration) and noninferior to that of a model containing actual prior 1-year relapse history. The predicted risk probability declined with disease duration and age.
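For readers unfamiliar with the two reported metrics, the following sketch shows how AUC and F1 can be computed from predicted risk probabilities with scikit-learn. The helper name `evaluate`, the 0.5 classification threshold, and the toy labels and scores are assumptions; the abstract does not state which threshold was used for the F1 score.

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score


def evaluate(y_true, risk, threshold=0.5):
    """Return (AUC, F1) for predicted risks; the threshold is an assumed cutoff."""
    y_true = np.asarray(y_true)
    risk = np.asarray(risk)
    auc = roc_auc_score(y_true, risk)                        # threshold-free ranking metric
    f1 = f1_score(y_true, (risk >= threshold).astype(int))   # depends on the cutoff
    return auc, f1


# Toy example with hypothetical labels and scores (not study data).
auc, f1 = evaluate([0, 0, 1, 1, 0, 1], [0.1, 0.4, 0.35, 0.8, 0.2, 0.7])
```

AUC depends only on how the model ranks patients, whereas F1 changes with the chosen cutoff, which is one reason the two values can differ substantially.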
Our novel machine-learning algorithm predicts 1-year MS relapse with accuracy comparable to other clinical prediction tools and is applicable at the point of care. This EHR-based two-stage approach to outcome prediction may have application to neurological diseases beyond MS.