Duke-NUS Medical School, Singapore.
MOH Holdings Private Ltd., Singapore.
J Diabetes Sci Technol. 2023 Mar;17(2):474-489. doi: 10.1177/19322968211056917. Epub 2021 Nov 3.
With the rising prevalence of diabetes, machine learning (ML) models have been increasingly used for prediction of diabetes and its complications, due to their ability to handle large complex data sets. This study aims to evaluate the quality and performance of ML models developed to predict microvascular and macrovascular diabetes complications in an adult Type 2 diabetes population.
A systematic review was conducted in MEDLINE®, Embase®, the Cochrane® Library, Web of Science®, and DBLP Computer Science Bibliography databases according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist. Studies that developed or validated ML prediction models for microvascular or macrovascular complications in people with Type 2 diabetes were included. Prediction performance was evaluated using the area under the receiver operating characteristic curve (AUC). An AUC >0.75 was taken to indicate clearly useful discrimination, and a positive mean relative AUC difference was taken to indicate that a model performed better than its comparator.
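As an illustration of the two performance measures named above, the following minimal sketch computes an AUC from predicted scores and a mean relative AUC difference between paired models. The abstract does not give the formula for the relative difference, so the definition used here, (AUC_model - AUC_comparator) / AUC_comparator, is an assumption, and all function names and the toy data are hypothetical.

```python
from statistics import mean

# Threshold stated in the abstract: AUC > 0.75 = clearly useful discrimination.
CLEARLY_USEFUL_AUC = 0.75


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def is_clearly_useful(auc_value):
    """Apply the strict > 0.75 threshold from the abstract."""
    return auc_value > CLEARLY_USEFUL_AUC


def relative_auc_difference(auc_model, auc_comparator):
    """ASSUMED definition: difference expressed relative to the comparator."""
    return (auc_model - auc_comparator) / auc_comparator


def mean_relative_auc_difference(pairs):
    """Average the relative difference over (model AUC, comparator AUC) pairs."""
    return mean(relative_auc_difference(m, c) for m, c in pairs)


# Toy data, not taken from any included study.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(labels, scores)
```

Under this assumed definition, a positive mean value means the models in the first position of each pair had, on average, a higher AUC than their comparators.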
Of 13 606 articles screened, 32 studies comprising 87 ML models were included. Neural networks (n = 15) were the most frequently used model type. Age, duration of diabetes, and body mass index were common predictors in ML models. Across predicted outcomes, 36% of the models demonstrated clearly useful discrimination. Most ML models reported a positive mean relative AUC difference compared with non-ML methods, with random forest showing the best overall performance for microvascular and macrovascular outcomes. The majority of studies (n = 31) had a high risk of bias.
Random forest was found to have the overall best prediction performance. Current ML prediction models remain largely exploratory, and external validation studies are required before their clinical implementation.
Open Science Framework (registration number: 10.17605/OSF.IO/UP49X).