Identification of missing hierarchical relations in the vaccine ontology using acquired term pairs.

Affiliations

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Publication information

J Biomed Semantics. 2022 Aug 13;13(1):22. doi: 10.1186/s13326-022-00276-2.

Abstract

BACKGROUND

The Vaccine Ontology (VO) is a biomedical ontology that standardizes vaccine annotation. Errors in VO affect the many applications that use it. Quality assurance of VO is therefore imperative to ensure that it provides accurate domain knowledge to these downstream tasks. Manual review to identify and fix quality issues (such as missing hierarchical is-a relations) is challenging given the complexity of the ontology. Automated approaches are highly desirable to facilitate the quality assurance of VO.

METHODS

We developed an automated lexical approach that identifies potentially missing is-a relations in VO. First, we construct two types of VO concept-pairs: (1) linked; and (2) unlinked. From each concept-pair, we then derive an Acquired Term Pair (ATP) based on the lexical features of its two concepts. If a linked concept-pair and an unlinked concept-pair yield the same ATP, we take this to indicate a potentially missing is-a relation between the unlinked pair of concepts.
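The matching step described above can be illustrated with a minimal Python sketch. The abstract does not say how an ATP is computed, so the sketch assumes one plausible derivation: treat each concept name as a bag of lower-cased words, drop the words the two names share, and keep the leftover words on each side as the pair. The function names (`words`, `acquired_term_pair`, `suggest_missing_isa`), the rule that candidate pairs must share at least one word, and the toy concept names are all hypothetical and are not taken from VO or from the authors' implementation.

```python
from collections import defaultdict
from itertools import permutations


def words(name):
    """Lower-cased bag of words for a concept name (assumed tokenization)."""
    return frozenset(name.lower().split())


def acquired_term_pair(child, parent):
    """Assumed ATP: the words unique to each side once shared words are dropped."""
    c, p = words(child), words(parent)
    return (c - p, p - c)


def suggest_missing_isa(names, linked):
    """Return unlinked (child, parent) pairs whose ATP also arises from a linked pair."""
    # Index every linked pair by the ATP it produces.
    linked_atps = defaultdict(list)
    for child, parent in linked:
        linked_atps[acquired_term_pair(child, parent)].append((child, parent))

    suggestions = []
    for a, b in permutations(names, 2):
        if (a, b) in linked or (b, a) in linked:
            continue  # already related in one direction
        if not words(a) & words(b):
            continue  # assumption: candidates must share at least one word
        if acquired_term_pair(a, b) in linked_atps:
            suggestions.append((a, b))
    return sorted(suggestions)


# Toy data (hypothetical names, not actual VO content).
names = [
    "influenza virus vaccine",
    "live attenuated influenza virus vaccine",
    "rabies virus vaccine",
    "live attenuated rabies virus vaccine",
]
linked = {("live attenuated influenza virus vaccine", "influenza virus vaccine")}

result = suggest_missing_isa(names, linked)
print(result)
```

In this toy case the linked pair yields the ATP ({live, attenuated}, {}), and the only unlinked pair that yields the same ATP is the rabies pair, so it is the one suggestion returned. Enumerating all ordered pairs is quadratic in the number of concepts; that is acceptable for a sketch but would likely need indexing on a full ontology.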

RESULTS

Applying this approach to version 1.1.192 of VO, we identified 232 potentially missing is-a relations. A VO domain expert manually reviewed a random sample of 70 of them and confirmed that 65 were valid missing is-a relations in VO (a precision of 92.86%).
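The reported precision follows directly from the review counts. A small check, using only the numbers stated above (the variable names are illustrative):

```python
reviewed = 70   # randomly sampled candidate relations that the expert reviewed
confirmed = 65  # candidates the expert judged to be valid missing is-a relations

# Precision = confirmed / reviewed, expressed as a percentage.
precision_pct = round(confirmed / reviewed * 100, 2)
print(f"precision = {precision_pct}%")
```

Note that this precision describes the reviewed sample of 70; the remaining 162 of the 232 candidates were not reviewed.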

CONCLUSIONS

The results indicate that our approach is highly effective in identifying missing is-a relations in VO.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d25/9375268/7a18366dc413/13326_2022_276_Fig1_HTML.jpg
