Jing Xia
Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, United States.
JMIR Med Inform. 2021 Aug 27;9(8):e20675. doi: 10.2196/20675.
The Unified Medical Language System (UMLS) has been a critical tool in biomedical and health informatics, and the year 2021 marks its 30th anniversary. The UMLS brings together many broadly used vocabularies and standards in the biomedical field to facilitate interoperability among different computer systems and applications.
Despite its longevity, no comprehensive publication analysis of the use of the UMLS exists. This review and analysis therefore provides an overview of the UMLS and of how it has been used in English-language peer-reviewed publications over the last 30 years.
PubMed, ACM Digital Library, and the Nursing & Allied Health Database were searched for studies. The primary search strategy was as follows: UMLS was used as a Medical Subject Headings term or a keyword, or appeared in the title or abstract. Only English-language publications were considered. The publications were screened first, then coded and categorized iteratively, following a grounded theory approach. The review process followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines.
A total of 943 publications were included in the final analysis. Of these, 32 publications were each assigned to 2 categories, so the category counts sum to 975 (943 + 32), which is the denominator used for the percentages. After analysis and categorization of the publications, the UMLS was found to be used in the following emerging themes or areas (the number of publications and their respective percentages are given in parentheses): natural language processing (230/975, 23.6%), information retrieval (125/975, 12.8%), terminology study (90/975, 9.2%), ontology and modeling (80/975, 8.2%), medical subdomains (76/975, 7.8%), other language studies (53/975, 5.4%), artificial intelligence tools and applications (46/975, 4.7%), patient care (35/975, 3.6%), data mining and knowledge discovery (25/975, 2.6%), medical education (20/975, 2.1%), degree-related theses (13/975, 1.3%), digital library (5/975, 0.5%), and the UMLS itself (150/975, 15.4%), as well as the UMLS for other purposes (27/975, 2.8%).
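The arithmetic behind these figures can be checked with a short script. The following is a minimal sketch, not part of the original analysis: the counts are copied from the paragraph above, while the variable and function names (`counts`, `percentages`, `top_themes`) are hypothetical and chosen for illustration.

```python
# Category counts copied from the Results paragraph; names are illustrative.
counts = {
    "natural language processing": 230,
    "information retrieval": 125,
    "terminology study": 90,
    "ontology and modeling": 80,
    "medical subdomains": 76,
    "other language studies": 53,
    "artificial intelligence tools and applications": 46,
    "patient care": 35,
    "data mining and knowledge discovery": 25,
    "medical education": 20,
    "degree-related theses": 13,
    "digital library": 5,
    "the UMLS itself": 150,
    "the UMLS for other purposes": 27,
}

UNIQUE_PUBLICATIONS = 943  # publications in the final analysis
DOUBLE_CODED = 32          # publications assigned to 2 categories

# Each double-coded publication is counted twice across the categories.
total_assignments = sum(counts.values())


def percentages(category_counts):
    """Return each category's share of all assignments, rounded to 1 decimal."""
    total = sum(category_counts.values())
    return {name: round(100 * n / total, 1) for name, n in category_counts.items()}


def top_themes(category_counts, k=3):
    """Return the k category names with the largest counts, largest first."""
    ranked = sorted(category_counts, key=category_counts.get, reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    print(total_assignments, UNIQUE_PUBLICATIONS + DOUBLE_CODED)
    for name, pct in percentages(counts).items():
        print(f"{name}: {counts[name]}/{total_assignments} ({pct}%)")
    print(top_themes(counts))
```

Because the denominator is the number of category assignments rather than the number of unique publications, the percentages describe shares of assignments, and a publication coded into 2 categories contributes to both.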
The UMLS has been used successfully in patient care, medical education, digital libraries, and software development, as originally planned, as well as in degree-related theses, the building of artificial intelligence tools, data mining and knowledge discovery, foundational work in methodology, and middle layers that may lead to advanced products. Natural language processing, the UMLS itself, and information retrieval are the 3 most common themes that emerged among the included publications. The results, although largely drawn from academia, demonstrate that the UMLS has achieved its intended uses and has also been applied broadly beyond its original intentions.