Department of Population Health, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, USA.
Behavioral Sciences Group, Sanford Research, 2301 East 60th Street North, Sioux Falls, SD, 57104, USA.
BMC Med Inform Decis Mak. 2021 Mar 31;21(1):111. doi: 10.1186/s12911-021-01474-1.
Diabetes is a medical and economic burden in the United States. In this study, a machine learning model was developed to predict unplanned medical visits among patients with diabetes, and the findings were used to design a clinical intervention in the sponsoring healthcare organization. This case study shows how predictive analytics can inform clinical actions and describes the practical factors that must be considered to translate research into clinical practice.
Data were drawn from the electronic medical records (EMRs) of a large healthcare organization in the Northern Plains region of the US and covered adult (≥ 18 years old) patients with type 1 or type 2 diabetes who received care at least once during a 3-year period. A variety of machine-learning classification models were run using standard EMR variables as predictors: age, body mass index (BMI), systolic blood pressure (BP), diastolic BP, low-density lipoprotein (LDL), high-density lipoprotein (HDL), glycohemoglobin (A1C), smoking status, number of diagnoses, and number of prescriptions. The best-performing model after cross-validation testing was analyzed to identify the strongest predictors.
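The model-selection step described above (fit several candidate classifiers, score each by cross-validation, keep the best) can be sketched as follows. This is a minimal illustration and not the study's pipeline: the two toy classifiers, the synthetic one-feature data, the function names, and the use of plain accuracy as the score are all assumptions made for the example.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle row indices and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def accuracy(y_true, y_pred):
    """Fraction of correct predictions (stand-in for the study's metric)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_validate(fit, X, y, k=5, seed=0):
    """Mean held-out score of a model family across k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        y_hat = [predict(X[i]) for i in held_out]
        scores.append(accuracy([y[i] for i in held_out], y_hat))
    return sum(scores) / len(scores)

# Hypothetical candidate 1: always predict the majority training class.
def fit_majority(X, y):
    label = 1 if 2 * sum(y) >= len(y) else 0
    return lambda row: label

# Hypothetical candidate 2: threshold feature 0 midway between class means.
def fit_threshold(X, y):
    ones = [row[0] for row, t in zip(X, y) if t == 1]
    zeros = [row[0] for row, t in zip(X, y) if t == 0]
    cut = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
    return lambda row: 1 if row[0] > cut else 0

# Synthetic data (not patient data): one feature that separates the classes.
rng = random.Random(1)
y = [i % 2 for i in range(100)]
X = [[2.0 * label + rng.gauss(0, 0.5)] for label in y]

candidates = {"majority": fit_majority, "threshold": fit_threshold}
cv_scores = {name: cross_validate(fit, X, y) for name, fit in candidates.items()}
best = max(cv_scores, key=cv_scores.get)
```

Scoring every candidate on the same held-out folds keeps the comparison fair. In a real analysis the score would more plausibly be balanced accuracy, the metric reported in the abstract.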
The best-performing model was a linear-basis support vector machine, which achieved a balanced accuracy (average of sensitivity and specificity) of 65.7%. This model outperformed a conventional logistic regression by 0.4 percentage points. A sensitivity analysis identified BP and HDL as the strongest predictors: disrupting these variables with random noise decreased the model's overall balanced accuracy by 1.3 and 1.4 percentage points, respectively. These findings, along with stakeholder engagement, behavioral economics strategies, and implementation science principles, helped to inform the design of a clinical intervention targeting behavioral changes.
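The two quantities reported above, balanced accuracy and the drop in balanced accuracy when one predictor is disrupted with random noise, can be sketched as follows. This is an illustrative sketch under stated assumptions: the abstract does not specify the noise distribution or scale, so Gaussian noise with a chosen standard deviation is assumed here, and the toy classifier, synthetic data, and function names are hypothetical.

```python
import random

def balanced_accuracy(y_true, y_pred):
    """Average of sensitivity (recall on 1s) and specificity (recall on 0s)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

def perturbation_drop(predict, X, y, col, rng, noise_sd=1.0):
    """Drop in balanced accuracy after adding Gaussian noise to one column.

    The Gaussian form and noise_sd are assumptions made for this sketch.
    """
    baseline = balanced_accuracy(y, [predict(row) for row in X])
    noisy = [
        row[:col] + [row[col] + rng.gauss(0, noise_sd)] + row[col + 1:]
        for row in X
    ]
    return baseline - balanced_accuracy(y, [predict(row) for row in noisy])

# Synthetic data (not patient data): column 0 determines the label,
# column 1 is pure noise that the toy classifier ignores.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
X = [[1.0 if label else -1.0, rng.gauss(0, 1)] for label in y]

def predict(row):
    """Hypothetical fitted classifier that looks only at column 0."""
    return 1 if row[0] > 0 else 0

drop_informative = perturbation_drop(predict, X, y, col=0, rng=rng, noise_sd=5.0)
drop_ignored = perturbation_drop(predict, X, y, col=1, rng=rng, noise_sd=5.0)
```

A predictor the model relies on produces a positive drop when disrupted, while a predictor the model ignores produces none. Ranking predictors by this drop is the kind of comparison that singled out BP and HDL in the study.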
Our machine-learning model predicted unplanned medical visits among patients with diabetes slightly more accurately than a conventional logistic regression (0.4 percentage points higher balanced accuracy). Post-hoc analysis of the model was used for hypothesis generation, namely that HDL and BP are the strongest contributors to unplanned medical visits among patients with diabetes. These findings were translated into a clinical intervention now being piloted at the sponsoring healthcare organization. In this way, the predictive model supports the move from prediction to implementation and to improved diabetes care management in clinical settings.