Leveraging Non-lattice Subgraphs to Audit Hierarchical Relations in NCI Thesaurus.

Author information

Abeysinghe Rashmie, Brooks Michael A, Cui Licong

Affiliation

School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX.

Publication information

AMIA Annu Symp Proc. 2020 Mar 4;2019:982-991. eCollection 2019.

Abstract

Auditing the National Cancer Institute (NCI) Thesaurus is essential to ensure that it provides accurate terminology for cancer-related clinical care as well as translational and basic research. We leverage a structural-lexical approach to identify missing hierarchical IS-A relations in the NCI Thesaurus based on non-lattice subgraphs and derived lexical attributes of concepts. For each concept in a non-lattice subgraph, we derive the concept's lexical attributes in two ways: (1) inheriting lexical attributes from its ancestors within the subgraph; and (2) inheriting lexical attributes from all its ancestors. For a pair of concepts with no hierarchical relation between them, if the lexical attributes of one concept are a subset of those of the other, we suggest a potential missing IS-A relation between the two concepts. Our approach identified 547 non-lattice subgraphs in the 19.01d release of the NCI Thesaurus, which revealed a total of 1,022 unique potential missing IS-A relations. A random sample of 100 relations was evaluated by a domain expert. Of these, 90 can be obtained by inheriting lexical attributes from ancestors within the non-lattice subgraph, of which 76 were confirmed as valid (a precision of 84.44%); and 82 can be obtained by inheriting lexical attributes from all ancestors, of which 73 were confirmed as valid (a precision of 89.02%). The results show that our structural-lexical approach based on non-lattice subgraphs is effective for auditing the NCI Thesaurus.
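
The subset test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: lexical attributes are taken to be the lowercase words of a concept's name, the subset must be proper, the concept with the larger attribute set is suggested as the child, and the concept identifiers, names, and edges are toy data rather than NCI Thesaurus content. The function and parameter names are hypothetical.

```python
from itertools import combinations


def ancestors(concept, parents, within=None):
    """Collect all ancestors of `concept` by following IS-A edges upward.

    parents: dict mapping a concept to the set of its direct parents.
    within:  optional set of concepts; if given, only ancestors inside it count.
    """
    seen, stack = set(), [concept]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if within is not None and parent not in within:
                continue
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def lexical_attributes(concept, names, parents, within=None):
    """Assumption: attributes are the words of the concept's own name
    plus the words inherited from its ancestors' names."""
    attrs = set(names[concept].lower().split())
    for anc in ancestors(concept, parents, within):
        attrs |= set(names[anc].lower().split())
    return attrs


def suggest_missing_isa(subgraph, names, parents, restrict_to_subgraph=True):
    """Return (child, parent) pairs suggested as potential missing IS-A relations.

    restrict_to_subgraph=True  inherits only from ancestors inside the subgraph;
    restrict_to_subgraph=False inherits from all ancestors.
    """
    within = subgraph if restrict_to_subgraph else None
    attrs = {c: lexical_attributes(c, names, parents, within) for c in subgraph}
    suggestions = set()
    for a, b in combinations(sorted(subgraph), 2):
        # Skip pairs that already have a hierarchical relation.
        if a in ancestors(b, parents) or b in ancestors(a, parents):
            continue
        if attrs[a] < attrs[b]:      # a's attributes are a proper subset of b's
            suggestions.add((b, a))  # assumed direction: b IS-A a
        elif attrs[b] < attrs[a]:
            suggestions.add((a, b))
    return suggestions


# Toy data: D lacks an IS-A link to C.
names = {
    "A": "neoplasm",
    "B": "skin neoplasm",
    "C": "malignant neoplasm",
    "D": "malignant skin neoplasm",
}
parents = {"B": {"A"}, "C": {"A"}, "D": {"B"}}
result = suggest_missing_isa({"A", "B", "C", "D"}, names, parents)
print(result)  # {('D', 'C')}
```

In this toy hierarchy both inheritance modes give the same suggestion. They can differ in a real terminology, where ancestors outside the subgraph contribute additional words.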
