Clinical Term Normalization Using Learned Edit Patterns and Subconcept Matching: System Development and Evaluation.

Author information

Kate Rohit J

Affiliation

Department of Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI, United States.

Publication information

JMIR Med Inform. 2021 Jan 14;9(1):e23104. doi: 10.2196/23104.

Abstract

BACKGROUND

Clinical terms mentioned in clinical text are often not in their standardized forms as listed in clinical terminologies because of linguistic and stylistic variations. However, many automated downstream applications require clinical terms mapped to their corresponding concepts in clinical terminologies, thus necessitating the task of clinical term normalization.

OBJECTIVE

This paper presents a system for clinical term normalization that uses edit patterns to convert clinical terms into their normalized forms.

METHODS

The edit patterns are learned automatically from the Unified Medical Language System (UMLS) Metathesaurus and from the given training data. They are generalized sequences of edits derived from edit distance computations. The patterns are both character based and word based, and they are learned separately for different semantic types. In addition to applying these edit patterns, the system normalizes clinical terms through the subconcepts mentioned within them.
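As a rough illustration of how an edit sequence can be read off an edit distance computation, the sketch below builds a standard Levenshtein table, backtraces one minimum-cost edit script, and then collapses unchanged tokens into a wildcard. The abstract does not specify the system's actual generalization procedure, so the function names (`edit_script`, `generalize`), the operation labels, and the wildcard scheme are all assumptions made for this example, not the paper's algorithm. The same code works at the character level (pass a string) and at the word level (pass a list of words).

```python
from typing import List, Sequence, Tuple

# (operation, source_token, target_token); labels are illustrative assumptions
Edit = Tuple[str, str, str]


def edit_script(source: Sequence[str], target: Sequence[str]) -> List[Edit]:
    """Return one minimum-cost Levenshtein edit script from source to target."""
    n, m = len(source), len(target)
    # dist[i][j] = edit distance between source[:i] and target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a source token
                dist[i][j - 1] + 1,         # insert a target token
                dist[i - 1][j - 1] + cost,  # keep or substitute
            )

    # Backtrace from the bottom-right cell to recover the edits.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            edits.append(("keep", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            edits.append(("substitute", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            edits.append(("delete", source[i - 1], ""))
            i -= 1
        else:
            edits.append(("insert", "", target[j - 1]))
            j -= 1
    edits.reverse()
    return edits


def generalize(edits: List[Edit]) -> List[Edit]:
    """Collapse each run of kept tokens into a single wildcard step."""
    wildcard: Edit = ("keep", "*", "*")
    pattern: List[Edit] = []
    for op, src, tgt in edits:
        if op == "keep":
            if not pattern or pattern[-1] != wildcard:
                pattern.append(wildcard)
        else:
            pattern.append((op, src, tgt))
    return pattern


# Character-based example: a spelling variant mapped to a standardized form.
char_pattern = generalize(edit_script("tumour", "tumor"))

# Word-based example: a plural variant mapped to its singular form.
word_edits = edit_script("heart attacks".split(), "heart attack".split())
```

Under these assumptions, the character-based example reduces to "keep anything, delete `u`, keep anything", which is the kind of reusable pattern that could then be applied to unseen terms. Learning such patterns separately per semantic type, as the abstract describes, would amount to keeping one pattern collection per type.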

RESULTS

The system was evaluated as part of the 2019 n2c2 Track 3 shared task of clinical term normalization. It obtained 80.79% accuracy on the standard test data. This paper includes ablation studies to evaluate the contributions of different components of the system. A challenging part of the task was disambiguation when a clinical term could be normalized to multiple concepts.

CONCLUSIONS

The learned edit patterns led the system to perform well on the normalization task. Given that the system is based on patterns, it is human interpretable and is also capable of giving insights about common variations of clinical terms mentioned in clinical text that are different from their standardized forms.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/86e7/7843202/4cbd1b002d8b/medinform_v9i1e23104_fig1.jpg
