Chomutare Taridzo, Svenning Therese Olsen, Hernández Miguel Ángel Tejedor, Ngo Phuong Dinh, Budrionis Andrius, Markljung Kaisa, Hind Lill Irene, Torsvik Torbjørn, Mikalsen Karl Øyvind, Babic Aleksandar, Dalianis Hercules
Department of Computer Science, Faculty of Science and Technology, UiT The Arctic University of Norway, Hansine Hansens vei 54, Tromsø, N-9037, Norway, 47 47680032.
Department of Health Data Analytics, Norwegian Centre for E-Health Research, Tromsø, Norway.
J Med Internet Res. 2025 Jul 3;27:e71904. doi: 10.2196/71904.
Clinical coding is critical for hospital reimbursement, quality assessment, and health care planning. In Scandinavia, however, coding is often done by junior doctors or medical secretaries, leading to high rates of coding errors. Artificial intelligence (AI) tools, particularly semiautomatic computer-assisted coding tools, have the potential to reduce the excessive burden of administrative and clinical documentation. To date, much of what we know regarding these tools comes from lab-based evaluations, which often fail to account for real-world complexity and variability in clinical text.
This study aims to investigate whether Easy-ICD (International Classification of Diseases), an AI tool developed by the Norwegian Centre for E-Health Research at the University Hospital of North Norway, can enhance clinical coding practices by reducing coding time and improving data quality in a realistic setting. We specifically examined whether improvements differ between long and short clinical notes, defined by word count.
An AI tool, Easy-ICD, was developed to assist clinical coders and was tested for its effect on both coding accuracy and coding time in a 1:1 crossover randomized controlled trial conducted in Sweden and Norway. Participants were randomly assigned to 2 groups (Sequence AB or BA) and crossed over between coding longer texts (Period 1; mean 307, SD 90 words) and shorter texts (Period 2; mean 166, SD 55 words), with and without our tool. This was a purely web-based trial, and participants were recruited through email. Coding time and accuracy were logged and analyzed using Mann-Whitney U tests for each of the 2 periods independently, because text lengths differed between the periods.
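The per-period comparison described above can be illustrated with a minimal sketch of a two-sided Mann-Whitney U test. This is not the authors' analysis code: the function name `mann_whitney_u`, the use of the normal approximation with a tie correction (and no continuity correction), and the sample timings are all assumptions made for illustration, and the timings are invented rather than taken from the trial.

```python
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate two-sided P value).
    Illustrative only; small samples would normally use an exact test.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((value, group) for group, sample in enumerate((x, y))
                    for value in sample)
    n1, n2 = len(x), len(y)
    n = n1 + n2

    # Assign midranks so that tied values share the average of their ranks.
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    # U for the first sample, from its rank sum.
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Normal approximation to the null distribution of U.
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / var_u ** 0.5
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p_value


if __name__ == "__main__":
    # Hypothetical coding times in seconds for one period (NOT trial data).
    with_tool = [140, 152, 131, 160, 149]
    without_tool = [265, 240, 290, 255, 270]
    u, p = mann_whitney_u(with_tool, without_tool)
    print(f"U = {u}, approximate two-sided P = {p:.4f}")
```

Running the test separately on Period 1 and Period 2 data, as the study did, avoids mixing timings from notes of very different lengths in a single comparison.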
The trial enrolled 17 participants, but only data from 15 participants (300 coded notes) were analyzed after 2 incomplete records were excluded. Based on the Mann-Whitney U test, the median coding time difference for longer clinical text sequences was 123 seconds (P<.001, 95% CI 81-164), representing a 46% reduction in median coding time when our tool was used. For shorter clinical notes, the median time difference of 11 seconds was not significant (P=.25, 95% CI -34 to 8). Coding accuracy improved with Easy-ICD for both longer (62% vs 67%) and shorter clinical notes (60% vs 70%), but these differences were not statistically significant (P=.50 and P=.17, respectively). User satisfaction ratings (submitted for 37% of cases) showed slightly higher approval for the tool's suggestions on longer clinical notes.
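As a rough reading aid for the 123-second and 46% figures, the sketch below back-calculates the medians they would imply. It assumes, without confirmation from the abstract, that the 46% is the 123-second difference divided by the median coding time without the tool. The helper `implied_medians` is hypothetical, and its outputs are derived approximations, not values reported by the study.

```python
def implied_medians(difference_s, relative_reduction):
    """Back-calculate medians from an absolute and a relative reduction.

    Assumes relative_reduction = difference_s / median_without_tool,
    which is an interpretation, not something the abstract states.
    """
    median_without_tool = difference_s / relative_reduction
    median_with_tool = median_without_tool - difference_s
    return median_without_tool, median_with_tool


if __name__ == "__main__":
    without_tool, with_tool = implied_medians(123, 0.46)
    print(f"Implied median without tool: {without_tool:.0f} s")
    print(f"Implied median with tool: {with_tool:.0f} s")
```

Under that assumption, the longer notes would have taken roughly four and a half minutes to code without the tool and a little under two and a half minutes with it.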
This study demonstrates the potential of AI to transform common tasks in clinical workflows, with ostensible positive impacts on work efficiencies for clinical coding tasks with more demanding longer text sequences. Further studies within hospital workflows are required before these presumed impacts can be more clearly understood.