Shin Dongyup, Kam Hye Jin, Jeon Min-Seok, Kim Ha Young
Graduate School of Information, Yonsei University, Seoul, Republic of Korea.
Healthcare, Life Solution Cluster, New Business Unit, Hanwha Life Insurance Co Ltd, Seoul, Republic of Korea.
JMIR Med Inform. 2021 Sep 21;9(9):e30223. doi: 10.2196/30223.
Korean institutions and enterprises that collect electronic medical examination results from multiple medical institutions receive them in nonstandardized, nonunified formats. Initially, a group of experienced nurses who could understand the results and their context classified the reports manually. The classification guidelines were built on the workers' years of clinical experience, and there were attempts to automate the classification work. However, rule-based algorithms and labor-intensive manual efforts can be time-consuming or are limited by a high potential for error. We investigated natural language processing (NLP) architectures and proposed ensemble models to create automated classifiers.
This study aimed to develop practical deep learning models, using electronic medical records from 284 health care institutions and open-source corpus data sets, for automatically classifying 3 thyroid conditions: healthy, caution required, and critical. The primary goal is to increase the overall accuracy of the classification. In addition, there are practical and industrial needs to correctly predict healthy (negative) thyroid condition data, which are mostly medical examination results, and to minimize the false-negative rate among reports predicted as healthy.
The data sets included thyroid and comprehensive medical examination reports. The textual data are written not only in complete sentences but also as lists of words or phrases. Therefore, we propose static and contextualized ensemble NLP network (SCENT) systems to reflect both static and contextual information and to handle incomplete sentences. We prepared convolutional neural network (CNN)-, long short-term memory (LSTM)-, and efficiently learning an encoder that classifies token replacements accurately (ELECTRA)-based ensemble models by training or fine-tuning each architecture multiple times. Through comprehensive experiments, we propose 2 ensemble models built from the single-architecture CNN, LSTM, and ELECTRA ensemble models: SCENT-v1 for the best classification performance and SCENT-v2 for practical use. SCENT-v1 is an ensemble of the CNN and ELECTRA ensemble models, and SCENT-v2 is a hierarchical ensemble of the CNN, LSTM, and ELECTRA ensemble models. SCENT-v2 first classifies each report into the 3 labels using the ELECTRA ensemble model and then reclassifies the reports predicted as "healthy" using an ensemble of the CNN and LSTM models.
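The two combination schemes described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each single-architecture ensemble outputs a probability vector over the 3 labels and that ensembles are combined by averaging probabilities (soft voting), which the abstract does not specify. The function names and label strings are hypothetical.

```python
from typing import List, Sequence

# Label order is an assumption made for this sketch.
LABELS = ("healthy", "caution", "critical")


def average_probs(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Soft voting (assumed): average class probabilities across models."""
    n = len(prob_sets)
    return [sum(p[i] for p in prob_sets) / n for i in range(len(LABELS))]


def argmax_label(probs: Sequence[float]) -> str:
    """Return the label with the highest probability."""
    return LABELS[max(range(len(LABELS)), key=lambda i: probs[i])]


def scent_v1(cnn_probs: Sequence[float], electra_probs: Sequence[float]) -> str:
    """Flat ensemble of the CNN and ELECTRA ensemble outputs."""
    return argmax_label(average_probs([cnn_probs, electra_probs]))


def scent_v2(
    cnn_probs: Sequence[float],
    lstm_probs: Sequence[float],
    electra_probs: Sequence[float],
) -> str:
    """Hierarchical ensemble: ELECTRA decides first; only reports it
    predicts as healthy are reclassified by the CNN + LSTM ensemble."""
    first_pass = argmax_label(electra_probs)
    if first_pass != "healthy":
        return first_pass
    return argmax_label(average_probs([cnn_probs, lstm_probs]))


# Illustrative probabilities (made up): ELECTRA says healthy, but the
# CNN + LSTM second stage overrides the prediction to caution.
print(scent_v2([0.2, 0.7, 0.1], [0.3, 0.6, 0.1], [0.8, 0.1, 0.1]))
```

The second stage in `scent_v2` can only move a report away from "healthy" or confirm it, which matches the stated aim of reducing false negatives among reports predicted as healthy.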
SCENT-v1 outperformed all the suggested models, with the highest F1 score (92.56%). SCENT-v2 had the second-highest recall value (94.44%) and the fewest misclassifications of the caution-required thyroid condition, while maintaining zero classification errors for the critical thyroid condition among reports predicted as healthy.
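The "errors under the prediction of healthy" criterion can be made concrete by counting, per true label, the non-healthy reports that a model predicted as healthy. The sketch below is one hedged reading of that criterion. The helper name `missed_under_healthy` is hypothetical, and the labels in the example are invented for illustration, not study data.

```python
from collections import Counter
from typing import Sequence


def missed_under_healthy(y_true: Sequence[str], y_pred: Sequence[str]) -> Counter:
    """Count reports predicted as healthy whose true label is not healthy,
    grouped by true label (i.e., false negatives for each condition)."""
    return Counter(
        t for t, p in zip(y_true, y_pred) if p == "healthy" and t != "healthy"
    )


# Invented toy labels, for illustration only.
y_true = ["healthy", "caution", "critical", "caution"]
y_pred = ["healthy", "healthy", "critical", "caution"]

missed = missed_under_healthy(y_true, y_pred)
print(missed["caution"], missed["critical"])
```

A `Counter` returns 0 for labels it has not seen, so `missed["critical"] == 0` directly expresses the "zero errors for the critical condition" check.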
The proposed SCENT demonstrates good classification performance despite the unique characteristics of the Korean language and the problems of data scarcity and imbalance, especially the extremely small amount of critical-condition data. The result of SCENT-v1 indicates that combining the different perspectives of static and contextual input token representations can enhance classification performance. SCENT-v2 has a strong impact on the prediction of healthy thyroid conditions.