Predicting Risk of Hypoglycemia in Patients With Type 2 Diabetes by Electronic Health Record-Based Machine Learning: Development and Validation.

Author information

Yang Hao, Li Jiaxi, Liu Siru, Yang Xiaoling, Liu Jialin

Affiliations

Information Center, West China Hospital, Sichuan University, Chengdu, China.

Department of Clinical Laboratory Medicine, Jinniu Maternity and Child Health Hospital of Chengdu, Chengdu, China.

Publication information

JMIR Med Inform. 2022 Jun 16;10(6):e36958. doi: 10.2196/36958.

Abstract

BACKGROUND

Hypoglycemia is a common adverse event in the treatment of diabetes. Coping with it efficiently requires effective hypoglycemia prediction models.

OBJECTIVE

The aim of this study was to develop and validate machine learning models to predict the risk of hypoglycemia in adult patients with type 2 diabetes.

METHODS

We used the electronic health records of all adult patients with type 2 diabetes admitted to West China Hospital between November 2019 and December 2021. The prediction model was developed based on XGBoost and natural language processing. F1 score, area under the receiver operating characteristic curve (AUC), and decision curve analysis (DCA) were used as the main criteria to evaluate model performance.
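The three evaluation criteria named above can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the toy labels and scores are invented for illustration, the function names are hypothetical, and the decision curve is reduced to the standard net benefit quantity, TP/n - (FP/n) * pt/(1 - pt), evaluated at a single threshold probability pt.

```python
# Illustrative sketch only: toy data and helper names are assumptions,
# not taken from the study.

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def net_benefit(y_true, scores, threshold):
    """Net benefit at one threshold probability, the quantity plotted in DCA."""
    n = len(y_true)
    tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Toy example: 10 hypothetical patients, 3 with hypoglycemia (label 1).
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.1, 0.2, 0.8, 0.3, 0.4, 0.05, 0.6, 0.9, 0.15, 0.25]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

toy_f1 = f1_score(y_true, y_pred)
toy_auc = auc(y_true, scores)
toy_nb = net_benefit(y_true, scores, 0.5)
print(toy_f1, toy_auc, toy_nb)
```

A full decision curve would repeat `net_benefit` over a range of thresholds and compare the model against the "treat all" and "treat none" strategies.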

RESULTS

We included 29,843 patients with type 2 diabetes, of whom 2804 (9.4%) developed hypoglycemia. The embedding machine learning model (XGBoost3) showed the best performance among all the models. The AUC and accuracy of XGBoost were 0.82 and 0.93, respectively. XGBoost3 was also superior to the other models in DCA.
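As a quick arithmetic check on the cohort figures, the sketch below recomputes the event rate from the reported counts. It also computes the accuracy of a trivial rule that never predicts hypoglycemia. That baseline is an illustration, not a result from the study, and it assumes the evaluation data has the same event rate as the whole cohort, which the abstract does not state.

```python
# Counts as reported in the abstract.
n_patients = 29843
n_hypoglycemia = 2804

# Share of patients who developed hypoglycemia.
prevalence = n_hypoglycemia / n_patients

# Accuracy of always predicting "no hypoglycemia" (illustrative baseline,
# assuming the same event rate in the evaluation data).
majority_accuracy = 1 - prevalence

print(f"prevalence: {prevalence:.1%}")
print(f"always-negative accuracy: {majority_accuracy:.3f}")
```

With events this rare, accuracy alone says little about how well a model identifies patients at risk, which is consistent with the study's choice of F1 score, AUC, and DCA as its main criteria.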

CONCLUSIONS

The Paragraph Vector-Distributed Memory model can effectively extract features and improve the performance of the XGBoost model, which can then effectively predict hypoglycemia in patients with type 2 diabetes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db54/9247813/4b37ada644cd/medinform_v10i6e36958_fig1.jpg
