Detecting Hypoglycemia Incidents Reported in Patients' Secure Messages: Using Cost-Sensitive Learning and Oversampling to Reduce Data Imbalance.

Authors

Chen Jinying, Lalor John, Liu Weisong, Druhl Emily, Granillo Edgard, Vimalananda Varsha G, Yu Hong

Affiliations

Department of Population and Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, United States.

Bedford Veterans Affairs Medical Center, Center for Healthcare Organization and Implementation Research, Bedford, MA, United States.

Publication information

J Med Internet Res. 2019 Mar 11;21(3):e11990. doi: 10.2196/11990.

Abstract

BACKGROUND

Improper dosing of medications such as insulin can cause hypoglycemic episodes, which may lead to severe morbidity or even death. Although secure messaging was designed for exchanging nonurgent messages, patients sometimes report hypoglycemia events through secure messaging. Detecting these patient-reported adverse events may help alert clinical teams and enable early corrective actions to improve patient safety.

OBJECTIVE

We aimed to develop a natural language processing system, called HypoDetect (Hypoglycemia Detector), to automatically identify hypoglycemia incidents reported in patients' secure messages.

METHODS

An expert in public health annotated 3000 secure message threads between patients with diabetes and US Department of Veterans Affairs clinical teams according to whether they contained patient-reported hypoglycemia incidents. A physician independently annotated 100 threads randomly selected from this dataset to determine interannotator agreement. We used this dataset to develop and evaluate HypoDetect. HypoDetect incorporates 3 machine learning algorithms widely used for text classification: linear support vector machines, random forest, and logistic regression. We explored different learning features, including new knowledge-driven features. Because only 114 (3.80%) messages were annotated as positive, we investigated cost-sensitive learning and oversampling methods to mitigate the challenge of imbalanced data.
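
As a rough illustration of the two imbalance-handling ideas named above, the Python sketch below computes inverse-frequency class weights for a 114-positive, 2886-negative split (2886 is simply 3000 minus 114), applies them in a weighted log loss, and generates synthetic minority points by SMOTE-style interpolation. The abstract does not state which weighting scheme or oversampling configuration HypoDetect used, so the weight formula, the function names, and the toy 2-D points are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def balanced_class_weights(n_negative, n_positive):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    This is one common heuristic; it is an assumption here, not the paper's scheme.
    """
    n_samples = n_negative + n_positive
    return {0: n_samples / (2 * n_negative), 1: n_samples / (2 * n_positive)}


def weighted_log_loss(y_true, p_pred, weights):
    """Weighted mean binary cross-entropy: each error counts in proportion to its class weight."""
    eps = 1e-12
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # keep log() finite
        w = weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Class counts taken from the abstract: 114 positive out of 3000.
weights = balanced_class_weights(n_negative=2886, n_positive=114)
# weights[1] is about 13.16, so a missed positive costs far more than a false alarm.

# Hypothetical 2-D feature vectors standing in for minority-class messages.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_synthetic = smote_like_oversample(toy_minority, n_new=5)
```

With these weights, a classifier that misses a positive message is penalized much more heavily than one that raises a false alarm, which is the mechanism by which cost-sensitive learning trades some specificity for sensitivity. The interpolation step only shows the core of SMOTE; the ensembled variant mentioned in the abstract is not reproduced here.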

RESULTS

Interannotator agreement, measured by Cohen kappa, was 0.976. Using cross-validation, logistic regression with cost-sensitive learning achieved the best performance (area under the receiver operating characteristic curve=0.954, sensitivity=0.693, specificity=0.974, F1 score=0.590). Cost-sensitive learning and the ensembled synthetic minority oversampling technique substantially improved the sensitivity of the baseline systems (absolute gains of 0.123 to 0.728). Our results show that a variety of features contributed to the best performance of HypoDetect.
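
For readers less familiar with the metrics reported above, the following Python sketch shows how sensitivity, specificity, F1 score, and Cohen kappa are computed from binary labels. The label vectors are invented toy data chosen only to make the arithmetic easy to follow; they are not the study's data, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of true positives that were detected (recall)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true negatives that were correctly left unflagged."""
    return tn / (tn + fp)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)


def cohen_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Toy labels (invented for illustration): 3 positives among 10 items.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(toy_true, toy_pred)
toy_sensitivity = sensitivity(tp, fn)
toy_specificity = specificity(tn, fp)
toy_f1 = f1_score(tp, fp, fn)
toy_kappa = cohen_kappa(toy_true, toy_pred)
```

Because positives are rare in this task, specificity can stay high even when many flagged messages are false alarms, which is why the F1 score is reported alongside sensitivity and specificity.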

CONCLUSIONS

Despite the challenge of data imbalance, HypoDetect achieved promising results for the task of detecting hypoglycemia incidents from secure messages. The system has great potential to facilitate early detection and treatment of hypoglycemia.
