Masud Jakir Hossain Bhuiyan, Shun Chiang, Kuo Chen-Cheng, Islam Md Mohaimenul, Yeh Chih-Yang, Yang Hsuan-Chia, Lin Ming-Chin
Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan.
Department of Otolaryngology, Shuang Ho Hospital, Taipei Medical University, New Taipei City 23561, Taiwan.
J Pers Med. 2022 Apr 28;12(5):707. doi: 10.3390/jpm12050707.
International Classification of Diseases (ICD) codes are used to improve clinical, financial, and administrative performance. Inaccurate ICD coding can lower the quality of care and delay or prevent reimbursement. However, selecting the appropriate ICD code from a patient's clinical history is time-consuming and requires expert knowledge. The rapid spread of electronic medical records (EMRs) has generated a large amount of clinical data and provides an opportunity to predict ICD codes using deep learning models. The main objective of this study was to use a deep learning-based natural language processing (NLP) model to accurately predict ICD-10 codes, which could help providers make better clinical decisions and improve their level of service. We retrospectively collected clinical notes from five outpatient departments (OPDs) at one university teaching hospital between January 2016 and December 2016. We applied NLP techniques, including global vectors (GloVe), word2vec, and embedding techniques, to process the data. The dataset was split into independent training and testing sets comprising 90% and 10% of the entire dataset, respectively. A convolutional neural network (CNN) model was developed, and its performance was measured using precision, recall, and F-score. A total of 21,953 medical records were collected from 5016 patients. The performance of the CNN model across the five departments was clinically satisfactory (precision: 0.50 to 0.69; recall: 0.78 to 0.91). The model performed best for the cardiology department, with a precision of 69%, a recall of 89%, and an F-score of 78%. The CNN model for predicting ICD-10 codes provides an opportunity to improve the quality of care. Implementing this model in real-world clinical settings could reduce the manual coding workload, enhance the efficiency of clinical coding, and support physicians in making better clinical decisions.
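The reported cardiology figures can be checked against one another. The sketch below assumes the F-score is the balanced F1 (the harmonic mean of precision and recall); the abstract does not say which variant or averaging scheme was used. The function names and the confusion counts are hypothetical and chosen only for illustration.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Cardiology figures from the abstract: precision 0.69, recall 0.89.
# Assumption: the reported F-score is F1.
cardiology_f1 = f_score(0.69, 0.89)
print(round(cardiology_f1, 2))  # 0.78, matching the reported F-score

# Hypothetical counts (not from the study) that give similar rates.
p, r = precision_recall(tp=89, fp=40, fn=11)
print(round(p, 2), round(r, 2))  # 0.69 0.89
```

Under that assumption, the reported precision and recall reproduce the stated 78% F-score, so the three cardiology figures are mutually consistent.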