Center for Outcomes Research and Clinical Epidemiology - CORESEARCH, Pescara, Italy.
Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy.
Diabetes Res Clin Pract. 2022 Aug;190:110013. doi: 10.1016/j.diabres.2022.110013. Epub 2022 Jul 21.
To construct predictive models of diabetes complications (DCs) from electronic medical records using big data machine learning.
Six groups of DCs were considered: eye complications; cardiovascular, cerebrovascular, and peripheral vascular disease; nephropathy; and diabetic neuropathy. A supervised, tree-based learning approach (XGBoost) was used to predict the onset of each complication within 5 years (task 1). In addition, separate predictions were made for early (within 2 years) and late (3-5 years) onset of each complication (task 2). A dataset of 147,664 patients seen over 15 years at 23 centers was used. External validation was performed in five additional centers. Models were evaluated on accuracy, sensitivity, specificity, and area under the ROC curve (AUC).
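The four evaluation measures named above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than any particular library routine.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and AUC for a binary outcome."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": roc_auc(y_true, y_score),
    }


if __name__ == "__main__":
    # Toy data: 1 = complication onset within the horizon, 0 = no onset.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    for name, value in classification_metrics(labels, scores).items():
        print(f"{name}: {value:.4f}")
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of measure are usually reported together.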
For all DCs considered, the predictive models in task 1 showed an accuracy > 70%, and the AUC largely exceeded 0.80, reaching 0.97 for nephropathy. For task 2, all predictive models showed an accuracy > 70% and an AUC > 0.85. Sensitivity in predicting the early occurrence of a complication ranged from 83.2% (peripheral vascular disease) to 88.5% (nephropathy).
A machine learning approach offers the opportunity to identify patients at greater risk of complications. This can help overcome clinical inertia and improve the quality of diabetes care.