Use of the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) for Processing Free Text in Health Care: Systematic Scoping Review.

Author information

Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland.

Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland.

Publication information

J Med Internet Res. 2021 Jan 26;23(1):e24594. doi: 10.2196/24594.

Abstract

BACKGROUND

Interoperability and secondary use of data are challenges in health care. Specifically, the reuse of clinical free text remains an unresolved problem. The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) has become the universal language of health care and presents characteristics of a natural language. Using it to represent clinical free text could be one way to improve interoperability.

OBJECTIVE

Although the use of SNOMED and SNOMED CT has already been reviewed, its specific use in processing and representing unstructured data such as clinical free text has not. This review aims to better understand SNOMED CT's use for representing free text in medicine.

METHODS

A scoping review was performed on the topic by searching MEDLINE, Embase, and Web of Science for publications featuring free-text processing and SNOMED CT. A recursive reference review was conducted to broaden the scope of research. The review covered the type of processed data, the targeted language, the goal of the terminology binding, the method used and, when appropriate, the specific software used.

RESULTS

In total, 76 publications were selected for an extensive study. English was the target language in 91% (n=69) of publications. The most frequent types of documents for which the terminology was used were complementary exam reports (n=18, 24%) and narrative notes (n=16, 21%). Mapping to SNOMED CT was the final goal of the research in 21% (n=16) of publications and a part of the final goal in 33% (n=25). The main objectives of mapping were information extraction (n=44, 39%), use as a feature in a classification task (n=26, 23%), and data normalization (n=23, 20%). The method used was rule-based in 70% (n=53) of publications, hybrid in 11% (n=8), and machine learning in 5% (n=4). In total, 12 different software packages were used to map text to SNOMED CT concepts, the most frequent being Medtex, Mayo Clinic Vocabulary Server, and Medical Text Extraction Reasoning and Mapping System. The full terminology was used in 64% (n=49) of publications, whereas only a subset was used in 30% (n=23) of publications. Postcoordination was proposed in 17% (n=13) of publications, and only 5% (n=4) of publications specifically mentioned the use of the compositional grammar.
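
To make the rule-based category above more concrete, the following is a minimal sketch of dictionary-based mapping from free text to SNOMED CT concepts. It is not taken from any of the reviewed publications or software packages. The lexicon, the function name `map_to_concepts`, and the sample sentence are hypothetical, and the concept identifiers are illustrative and should be checked against a licensed SNOMED CT release.

```python
import re

# Hypothetical mini-lexicon: lowercase surface form -> (concept ID, preferred term).
# The IDs are illustrative; verify them against a licensed SNOMED CT release.
LEXICON = {
    "myocardial infarction": ("22298006", "Myocardial infarction"),
    "heart attack": ("22298006", "Myocardial infarction"),
    "pneumonia": ("233604007", "Pneumonia"),
    "asthma": ("195967001", "Asthma"),
}

# Try longer surface forms first so multi-word terms win over shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(LEXICON, key=len, reverse=True))
    + r")\b"
)


def map_to_concepts(text):
    """Return a list of (start, end, concept_id, term) for each lexicon hit."""
    matches = []
    for m in _PATTERN.finditer(text.lower()):
        concept_id, term = LEXICON[m.group(1)]
        matches.append((m.start(), m.end(), concept_id, term))
    return matches


if __name__ == "__main__":
    for hit in map_to_concepts("Patient had a heart attack in 2019; no asthma."):
        print(hit)
```

This sketch only performs exact string lookup on lowercased text. It does not handle negation (the sample sentence says "no asthma", yet asthma is still matched), abbreviations, misspellings, or postcoordination, which are among the reasons real systems add further rules or statistical components.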

CONCLUSIONS

SNOMED CT has been widely used to represent free-text data, most frequently with rule-based approaches and in English. However, there is currently no easy solution for mapping free text to this terminology and performing automatic postcoordination. Most solutions conceive SNOMED CT as a simple terminology rather than as a compositional bag of ontologies. Since 2012, the number of publications on this subject per year has decreased. However, the need for formal semantic representation of free text in health care is high, and automatic encoding into a compositional ontology could be a solution.
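
As an illustration of what a postcoordinated expression looks like, the sketch below renders a focus concept and its attribute-value refinements as a string in the general shape of the SNOMED CT compositional grammar (`focus : attribute = value`). The helper `render_expression` is hypothetical and covers only ungrouped refinements. The identifiers in the example are illustrative and should be verified against a current release.

```python
def render_expression(focus, refinements=()):
    """Render (id, term) tuples as 'focus : attribute = value, ...'.

    focus       -- (concept_id, term) of the focus concept
    refinements -- iterable of ((attr_id, attr_term), (value_id, value_term))
    """
    def ref(concept):
        concept_id, term = concept
        return f"{concept_id} |{term}|"

    expression = ref(focus)
    pairs = [f"{ref(attr)} = {ref(value)}" for attr, value in refinements]
    if pairs:
        expression += " : " + ", ".join(pairs)
    return expression


if __name__ == "__main__":
    # Illustrative IDs; verify against a licensed SNOMED CT release.
    print(render_expression(
        ("83152002", "Oophorectomy"),
        [(("405815000", "Procedure device"), ("122456005", "Laser device"))],
    ))
```

Producing such a string is the easy part. The difficulty noted in the conclusions lies in deciding automatically, from free text, which focus concept, attributes, and values are warranted and permitted by the concept model.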

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab9/7872838/77e05522e8a0/jmir_v23i1e24594_fig1.jpg
