Validating UMLS Semantic Type Assignments Using SNOMED CT Semantic Tags.

Author information

Gu Huanying, He Zhe, Wei Duo, Elhanan Gai, Chen Yan

Publication information

Methods Inf Med. 2018 Feb;57(1):43-53. doi: 10.3414/ME17-01-0120. Epub 2018 Apr 5.

Abstract

BACKGROUND

The Unified Medical Language System (UMLS) assigns semantic types to all of its integrated concepts. These semantic types are widely used in biomedical natural language processing tasks such as named entity recognition, semantic disambiguation, and semantic annotation. Because of the size of the UMLS, erroneous semantic type assignments are hard to detect. It is imperative to devise automated techniques to identify errors and inconsistencies in semantic type assignments.

OBJECTIVES

To design a methodology that programmatically detects semantic type assignment errors in UMLS concepts with one or more SNOMED CT terms, and to evaluate concepts in a selected set of SNOMED CT hierarchies in order to verify our hypothesis that UMLS semantic type assignment errors may exist in concepts residing in semantically inconsistent groups.

METHODS

Our methodology is a four-stage process: 1) partitioning the concepts in a SNOMED CT hierarchy into semantically uniform groups based on their assigned semantic tags; 2) partitioning the concepts in each group from 1) into disjoint subgroups based on their semantic type assignments; 3) mapping every SNOMED CT semantic tag to one or more semantic types in the UMLS; 4) identifying semantically inconsistent groups, that is, groups whose semantic type assignments disagree with their semantic tags under the mapping from 3), and providing the concepts in such groups to domain experts for review.
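The four stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the concept identifiers, the toy concepts, the tag-to-type mapping in `TAG_TO_TYPES`, and the rule that a subgroup is consistent only when all of its semantic types are allowed for its tag are assumptions made for illustration. The only convention taken as given is that a SNOMED CT fully specified name ends with its semantic tag in parentheses.

```python
import re
from collections import defaultdict

# Hypothetical toy data, not taken from the UMLS, SNOMED CT, or the paper.
# Each entry: (made-up identifier, fully specified name, assigned semantic types).
CONCEPTS = [
    ("c1", "Gravity (physical force)", {"Natural Phenomenon or Process"}),
    ("c2", "Friction (physical force)", {"Natural Phenomenon or Process"}),
    ("c3", "Pressure (physical force)", {"Quantitative Concept"}),
    ("c4", "Clinical report (record artifact)", {"Intellectual Product"}),
    ("c5", "Patient chart (record artifact)", {"Manufactured Object"}),
]

# Stage 3: semantic tag -> allowed semantic types (assumed mapping).
TAG_TO_TYPES = {
    "physical force": {"Natural Phenomenon or Process"},
    "record artifact": {"Intellectual Product"},
}

# The semantic tag is the last parenthesized phrase of the name.
TAG_PATTERN = re.compile(r"\(([^()]+)\)\s*$")


def semantic_tag(fsn):
    """Return the trailing parenthesized semantic tag of a fully specified name."""
    match = TAG_PATTERN.search(fsn)
    if match is None:
        raise ValueError(f"no semantic tag found in {fsn!r}")
    return match.group(1)


def partition(concepts):
    """Stages 1 and 2: group by semantic tag, then by semantic type set."""
    groups = defaultdict(lambda: defaultdict(list))
    for concept_id, fsn, types in concepts:
        groups[semantic_tag(fsn)][frozenset(types)].append(concept_id)
    return groups


def inconsistent_groups(concepts, tag_to_types):
    """Stage 4: flag subgroups whose types are not all allowed for their tag."""
    flagged = []
    for tag, subgroups in partition(concepts).items():
        allowed = tag_to_types.get(tag, set())
        for types, concept_ids in subgroups.items():
            if not types <= allowed:  # assumed consistency rule
                flagged.append((tag, sorted(types), sorted(concept_ids)))
    return sorted(flagged)


flagged = inconsistent_groups(CONCEPTS, TAG_TO_TYPES)
for tag, types, concept_ids in flagged:
    print(f"{tag}: {types} -> review {concept_ids}")
```

In this toy run only the subgroups containing `c3` and `c5` are flagged, so a reviewer would inspect two concepts instead of five. This mirrors the idea of concentrating expert review on semantically inconsistent groups.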

RESULTS

We applied our method to the UMLS 2013AA release. Concepts in the semantically inconsistent groups of the PHYSICAL FORCE and RECORD ARTIFACT hierarchies have error rates of 33% and 62.5%, respectively, far higher than the error rates of 0.6% and 1% in the semantically consistent groups of the same two hierarchies.

CONCLUSION

Concepts in semantically inconsistent groups are more likely to contain semantic type assignment errors. Our methodology can make auditing more efficient by focusing auditing resources on the concepts in semantically inconsistent groups.
