Automatic Structuring of Ontology Terms Based on Lexical Granularity and Machine Learning: Algorithm Development and Validation.

Authors

Luo Lingyun, Feng Jingtao, Yu Huijun, Wang Jiaolong

Affiliations

School of Computer Science, University of South China, Hengyang, China.

Hunan Medical Big Data International Science and Technology Innovation Cooperation Base, Hengyang, China.

Publication information

JMIR Med Inform. 2020 Nov 25;8(11):e22333. doi: 10.2196/22333.

DOI:10.2196/22333

PMID:33127601

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7725650/

Abstract

BACKGROUND

As the manual creation and maintenance of biomedical ontologies are labor-intensive, automatic aids are desirable in the lifecycle of ontology development.

OBJECTIVE

Given a set of concept names from the Foundational Model of Anatomy (FMA), we propose an innovative method for automatically generating both the taxonomy and the partonomy structures among them.

METHODS

Our approach comprises 2 main tasks. The first is to predict the direct relation between 2 given concept names, using word embedding methods and 2 trained machine learning models: a convolutional neural network (CNN) and a bidirectional long short-term memory network (Bi-LSTM). The second is to identify the semantic structures among a group of given concept names, using an original granularity-based method that leverages these trained models.
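To make the relationship between the two tasks concrete, the sketch below shows the naive way a pairwise relation predictor could be applied to a group of names: query every ordered pair and keep the pairs predicted as direct relations. This is only an illustration of the pairwise idea, not the granularity-based method, whose details the abstract does not give. The function `toy_is_direct_child` is a hypothetical string heuristic standing in for a trained CNN or Bi-LSTM classifier, and the example names are chosen for illustration.

```python
from itertools import permutations
from typing import Callable, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (child name, parent name)


def toy_is_direct_child(child: str, parent: str) -> bool:
    """Hypothetical stand-in for a trained pairwise classifier.

    Assumption for illustration only: a name is a direct child of another
    if it equals the parent name with exactly one extra leading word.
    A real system would call the trained CNN or Bi-LSTM model here.
    """
    c, p = child.lower().split(), parent.lower().split()
    return len(c) == len(p) + 1 and c[1:] == p


def pairwise_structure(
    names: Iterable[str], predict: Callable[[str, str], bool]
) -> Set[Edge]:
    """Query every ordered pair of names and keep the predicted direct edges."""
    return {(a, b) for a, b in permutations(list(names), 2) if predict(a, b)}


names = ["lung", "left lung", "right lung", "upper lobe of left lung"]
edges = pairwise_structure(names, toy_is_direct_child)
print(sorted(edges))
```

Querying all ordered pairs costs a number of predictions quadratic in the group size, and every misclassified pair becomes a spurious or missing edge, which is one plausible reason a structure-aware method could do better than this baseline.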

RESULTS

Results show that both CNN and Bi-LSTM perform well on the first task, with F1 measures above 0.91. For the second task, our approach achieves an average F1 measure of 0.79 on 100 case studies in the FMA using Bi-LSTM, which outperforms the primitive pairwise-based method.
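For readers unfamiliar with the F1 measure, the sketch below computes it for a predicted structure, assuming the structure is scored as a set of (child, parent) edges against a reference set. The abstract does not state how the score was computed for the second task, so the edge-level scoring and the name `edge_f1` are assumptions made for illustration.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (child name, parent name)


def edge_f1(predicted: Set[Edge], gold: Set[Edge]) -> float:
    """F1 = 2 * precision * recall / (precision + recall) over edge sets."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative edge sets (not data from the study).
gold = {("left lung", "lung"), ("right lung", "lung"), ("lung", "organ")}
predicted = {("left lung", "lung"), ("right lung", "lung"), ("organ", "lung")}
score = edge_f1(predicted, gold)
print(round(score, 3))
```

In this example 2 of the 3 predicted edges are correct and 2 of the 3 reference edges are recovered, so precision and recall are both 2/3 and so is F1.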

CONCLUSIONS

We have investigated an automatic way of predicting a hierarchical relationship between 2 concept names; based on this, we have further invented a methodology to structure a group of concept names automatically. This study is an initial investigation that will shed light on further work on the automatic creation and enrichment of biomedical ontologies.
