Evaluating the coverage of controlled health data terminologies: report on the results of the NLM/AHCPR large scale vocabulary test.

Authors

Humphreys B L, McCray A T, Cheh M L

Affiliation

National Library of Medicine, Bethesda, MD 20894, USA.

Publication

J Am Med Inform Assoc. 1997 Nov-Dec;4(6):484-500. doi: 10.1136/jamia.1997.0040484.

DOI:10.1136/jamia.1997.0040484

PMID:9391936

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC61267/

Abstract

OBJECTIVE

To determine the extent to which a combination of existing machine-readable health terminologies covers the concepts and terms needed for a comprehensive controlled vocabulary for health information systems by carrying out a distributed national experiment using the Internet and the UMLS Knowledge Sources, lexical programs, and server.

METHODS

Using a specially designed Web-based interface to the UMLS Knowledge Source Server, participants searched the more than 30 vocabularies in the 1996 UMLS Metathesaurus and three planned additions to determine if concepts for which they desired controlled terminology were present or absent. For each term submitted, the interface presented a candidate exact match or a set of potential approximate matches from which the participant selected the most closely related concept. The interface captured a profile of the terms submitted by the participant and, for each term searched, information about the concept (if any) selected by the participant. The term information was loaded into a database at NLM for review and analysis and was also available to be downloaded by the participant. A team of subject experts reviewed records to identify matches missed by participants and to correct any obvious errors in relationships. The editors of SNOMED International and the Read Codes were given a random sample of reviewed terms for which exact meaning matches were not found to identify exact matches that were missed or any valid combinations of concepts that were synonymous with input terms. The 1997 UMLS Metathesaurus was used in the semantic type and vocabulary source analysis because it included most of the three planned additions.
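The lookup step described above (a candidate exact match, otherwise a set of approximate matches) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `normalize` function is a toy stand-in and not the actual UMLS lexical program, the word-overlap ranking is one simple way to produce approximate candidates, and the vocabulary entries and identifiers are hypothetical.

```python
import re


def normalize(term: str) -> str:
    """Toy normalization: lowercase, drop punctuation, sort the words.

    Assumption for illustration only; the real UMLS lexical programs
    are more elaborate than this.
    """
    words = re.findall(r"[a-z0-9]+", term.lower())
    return " ".join(sorted(words))


def lookup(term: str, vocabulary: dict[str, str]) -> tuple[str, list[str]]:
    """Classify a submitted term against a vocabulary.

    Returns ("exact", [concept_id]) when the normalized strings agree,
    ("approximate", candidates) when at least one word is shared,
    and ("none", []) otherwise.
    """
    key = normalize(term)
    index = {normalize(name): concept_id for name, concept_id in vocabulary.items()}
    if key in index:
        return "exact", [index[key]]

    # Rank approximate candidates by the number of shared words.
    query_words = set(key.split())
    scored = []
    for normalized_name, concept_id in index.items():
        shared = len(query_words & set(normalized_name.split()))
        if shared:
            scored.append((shared, concept_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    candidates = [concept_id for _, concept_id in scored]
    return ("approximate", candidates) if candidates else ("none", [])


# Hypothetical two-entry vocabulary; the identifiers are made up.
TOY_VOCABULARY = {
    "Myocardial infarction": "C-0001",
    "Fracture of femur": "C-0002",
}

if __name__ == "__main__":
    for submitted in ("Infarction, myocardial", "acute myocardial infarction", "asthma"):
        print(submitted, "->", lookup(submitted, TOY_VOCABULARY))
```

In the test itself the participant, not the program, chose the most closely related concept from the candidates; the sketch covers only the candidate-generation step.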

RESULTS

Sixty-three participants submitted a total of 41,127 terms, which represented 32,679 normalized strings. More than 80% of the terms submitted were wanted for parts of the patient record related to the patient's condition. Following review, 58% of all submitted terms had exact meaning matches in the controlled vocabularies in the test, 41% had related concepts, and 1% were not found. Of the 28% of the terms that were narrower in meaning than a concept in the controlled vocabularies, 86% shared lexical items with the broader concept, but had additional modification. The percentage of exact meaning matches varied by specialty from 45% to 71%. Twenty-nine different vocabularies contained meanings for some of the 23,837 terms (a maximum of 12,707 discrete concepts) with exact meaning matches. Based on preliminary data and analysis, individual vocabularies contained < 1% to 63% of the terms and < 1% to 54% of the concepts. Only SNOMED International and the Read Codes had more than 60% of the terms and more than 50% of the concepts.
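The reported counts can be cross-checked against the reported percentages with a few lines of arithmetic. The Python sketch below uses only figures stated in the results; the variable names are illustrative, and the terms-per-concept figure is a derived lower bound (because 12,707 is given as a maximum), not a number reported in the abstract.

```python
# Counts taken directly from the results paragraph.
TERMS_SUBMITTED = 41_127
NORMALIZED_STRINGS = 32_679
EXACT_MATCH_TERMS = 23_837
MAX_DISCRETE_CONCEPTS = 12_707


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of submitted terms with an exact meaning match (reported as 58%).
exact_share = percent(EXACT_MATCH_TERMS, TERMS_SUBMITTED)

# Share of submitted terms that remain distinct after normalization.
distinct_share = percent(NORMALIZED_STRINGS, TERMS_SUBMITTED)

# Lower bound on exactly matched terms per concept, since the concept
# count is stated as a maximum.
terms_per_concept = EXACT_MATCH_TERMS / MAX_DISCRETE_CONCEPTS

if __name__ == "__main__":
    print(f"exact meaning matches: {exact_share:.1f}% of submitted terms")
    print(f"distinct normalized strings: {distinct_share:.1f}% of submitted terms")
    print(f"terms per matched concept: at least {terms_per_concept:.2f}")
```

The first figure rounds to the 58% given in the results, which suggests that percentage was computed over all 41,127 submitted terms rather than over the normalized strings.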

CONCLUSIONS

The combination of existing controlled vocabularies included in the test represents the meanings of the majority of the terminology needed to record patient conditions, providing substantially more exact matches than any individual vocabulary in the set. From a technical and organizational perspective, the test was successful and should serve as a useful model, both for distributed input to the enhancement of controlled vocabularies and for other kinds of collaborative informatics research.

Similar articles

Evaluating the coverage of controlled health data terminologies: report on the results of the NLM/AHCPR large scale vocabulary test.

J Am Med Inform Assoc. 1997 Nov-Dec;4(6):484-500. doi: 10.1136/jamia.1997.0040484.

Planned NLM/AHCPR large-scale vocabulary test: using UMLS technology to determine the extent to which controlled vocabularies cover terminology needed for health care and public health.

J Am Med Inform Assoc. 1996 Jul-Aug;3(4):281-7. doi: 10.1136/jamia.1996.96413136.

Phase II evaluation of clinical coding schemes: completeness, taxonomy, mapping, definitions, and clarity. CPRI Work Group on Codes and Structures.

J Am Med Inform Assoc. 1997 May-Jun;4(3):238-51. doi: 10.1136/jamia.1997.0040238.

The completeness of existing lexicons for representing radiology report information.

J Digit Imaging. 2002;15 Suppl 1:201-5. doi: 10.1007/s10278-002-5046-5. Epub 2002 Mar 21.

Representation of everyday clinical nursing language in UMLS and SNOMED.

Proc AMIA Annu Fall Symp. 1996:140-4.

Department of Veterans Affairs, University of Utah consortium participation in the NLM/AHCPR Large Scale Vocabulary Test.

Proc AMIA Annu Fall Symp. 1997:565-9.

Extracting medical knowledge for a coded problem list vocabulary from the UMLS Knowledge Sources.

Proc AMIA Symp. 1998:275-9.

Representation of nursing terminology in the UMLS Metathesaurus: a pilot study.

Proc Annu Symp Comput Appl Med Care. 1992:392-6.

Conducting the NLM/AHCPR Large Scale Vocabulary Test: a distributed Internet-based experiment.

Proc AMIA Annu Fall Symp. 1997:560-4.

Dental concepts in the Unified Medical Language System.

Quintessence Int. 2002 Jan;33(1):69-74.

Cited by

The U.S. National Library of Medicine and standards for electronic health records: One thing led to another.

Inf Serv Use. 2022 May 10;42(1):81-94. doi: 10.3233/ISU-210142. eCollection 2022.

Toward Interoperability: A New Resource to Support Nursing Terminology Standards.

Comput Inform Nurs. 2015 Dec;33(12):515-9. doi: 10.1097/CIN.0000000000000210.

SNOMED CT in a language isolate: an algorithm for a semiautomatic translation.

BMC Med Inform Decis Mak. 2015;15 Suppl 2(Suppl 2):S5. doi: 10.1186/1472-6947-15-S2-S5. Epub 2015 Jun 15.

Auditing the multiply-related concepts within the UMLS.

J Am Med Inform Assoc. 2014 Oct;21(e2):e185-93. doi: 10.1136/amiajnl-2013-002227. Epub 2014 Jan 24.

Guidance on Evaluating Options for Representing Clinical Data within Health Information Systems.

NI 2012 (2012). 2012 Jun 23;2012:152. eCollection 2012.

Feasibility of using Clinical Element Models (CEM) to standardize phenotype variables in the database of genotypes and phenotypes (dbGaP).

PLoS One. 2013 Sep 18;8(9):e76384. doi: 10.1371/journal.pone.0076384. eCollection 2013.

Leveraging concept-based approaches to identify potential phyto-therapies.

J Biomed Inform. 2013 Aug;46(4):602-14. doi: 10.1016/j.jbi.2013.04.008. Epub 2013 May 9.

Leveraging biodiversity knowledge for potential phyto-therapeutic applications.

J Am Med Inform Assoc. 2013 Jul-Aug;20(4):668-79. doi: 10.1136/amiajnl-2012-001445. Epub 2013 Mar 21.

Detection and characterization of usability problems in structured data entry interfaces in dentistry.

Int J Med Inform. 2013 Feb;82(2):128-38. doi: 10.1016/j.ijmedinf.2012.05.018. Epub 2012 Jun 29.

An evaluation of the UMLS in representing corpus derived clinical concepts.

AMIA Annu Symp Proc. 2011;2011:435-44. Epub 2011 Oct 22.
