Predictive model and risk analysis for diabetic retinopathy using machine learning: a retrospective cohort study in China.

Affiliations

Medical School of Chinese PLA, Beijing, China.

Department of Ophthalmology, Chinese PLA General Hospital, Beijing, China.

Publication information

BMJ Open. 2021 Nov 26;11(11):e050989. doi: 10.1136/bmjopen-2021-050989.

Abstract

OBJECTIVE

To investigate risk factors and predictive models for diabetic retinopathy (DR) by applying machine learning to a large-sample dataset.

DESIGN

Retrospective study based on a large-sample, high-dimensional database.

SETTING

A Chinese central tertiary hospital in Beijing.

PARTICIPANTS

Information on 32 452 inpatients with type 2 diabetes mellitus (T2DM) was retrieved from the electronic medical record system for the period from 1 January 2013 to 31 December 2017.

METHODS

Sixty variables (including demographic information, physical and laboratory measurements, systemic diseases and insulin treatments) were retained for baseline analysis. The optimal 17 variables were selected by recursive feature elimination. The prediction model was built with the XGBoost algorithm and compared with three other popular machine learning techniques: logistic regression, random forest and support vector machine. To explain the results of the XGBoost model more visually, the SHapley Additive exPlanations (SHAP) method was used.
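To make the idea behind SHAP concrete, the following minimal sketch computes exact Shapley values for a toy risk function by averaging each feature's marginal contribution over all feature orderings. Everything in it is an assumption for illustration: the function `toy_risk`, its coefficients, and the three binary features are hypothetical and are not the study's model or data. The SHAP library itself uses faster, model-specific algorithms for tree ensembles rather than this brute-force enumeration.

```python
from itertools import permutations
from math import factorial


def toy_risk(features):
    """Hypothetical risk score (NOT the study's XGBoost model)."""
    score = 0.05  # assumed baseline risk
    if features["hba1c_gt_8"]:
        score += 0.10
    if features["nephropathy"]:
        score += 0.08
    if features["hba1c_gt_8"] and features["nephropathy"]:
        score += 0.04  # assumed interaction term
    if features["age_gt_65"]:
        score -= 0.03
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)  # start from the reference patient
        previous = model(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to the patient's value
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


patient = {"hba1c_gt_8": True, "nephropathy": True, "age_gt_65": True}
baseline = {"hba1c_gt_8": False, "nephropathy": False, "age_gt_65": False}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is the additivity property that lets SHAP plots decompose a single prediction into per-feature contributions. The interaction term is split equally between the two features involved.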

RESULTS

DR occurred in 2038 (6.28%) of the T2DM patients. The XGBoost model was identified as the best prediction model, with the highest area under the curve (AUC, 0.90). It showed that an HbA1c value greater than 8%, nephropathy, a serum creatinine value greater than 100 µmol/L, insulin treatment and diabetic lower extremity arterial disease were associated with an increased risk of DR, whereas age over 65 years was associated with a decreased risk.
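As a quick check on the figures above, the sketch below recomputes the DR proportion from the reported counts and shows how an AUC can be computed from labels and predicted scores using the pairwise-ranking definition. The helper `roc_auc` and the six labels and scores are made up for illustration; they are not the study's predictions and do not reproduce the reported 0.90.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Proportion of DR cases from the counts reported in the abstract.
dr_cases, t2dm_patients = 2038, 32452
dr_percent = round(100 * dr_cases / t2dm_patients, 2)
print(f"DR proportion: {dr_percent}%")

# Hypothetical labels (1 = DR) and model scores, for illustration only.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.6, 0.1, 0.4]
auc = roc_auc(labels, scores)
print(f"Toy AUC: {auc:.3f}")
```

Because only about 6% of patients had DR, the classes are strongly imbalanced. AUC depends on how cases are ranked rather than on a single decision threshold, which is one reason it is commonly reported in this setting.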

CONCLUSIONS

With better overall performance than the comparison models, the XGBoost model was highly reliable for assessing risk indicators of DR. The SHAP method can identify the most critical risk factors for DR and their cut-off values, making the output of the XGBoost model clinically interpretable.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/514c/8628336/1070397162de/bmjopen-2021-050989f01.jpg
