Automatic Structuring of Ontology Terms Based on Lexical Granularity and Machine Learning: Algorithm Development and Validation

Authors

Luo Lingyun, Feng Jingtao, Yu Huijun, Wang Jiaolong

Affiliations

School of Computer Science, University of South China, Hengyang, China.

Hunan Medical Big Data International Science and Technology Innovation Cooperation Base, Hengyang, China.

Publication information

JMIR Med Inform. 2020 Nov 25;8(11):e22333. doi: 10.2196/22333.

DOI: 10.2196/22333
PMID: 33127601
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7725650/

Abstract

BACKGROUND

As the manual creation and maintenance of biomedical ontologies are labor-intensive, automatic aids are desirable in the lifecycle of ontology development.

OBJECTIVE

Given a set of concept names from the Foundational Model of Anatomy (FMA), we propose a method for automatically generating both the taxonomy and the partonomy structures among them.

METHODS

Our approach comprises 2 main tasks. The first is to predict the direct relation between 2 given concept names using word embeddings and 2 trained machine learning models: a convolutional neural network (CNN) and a bidirectional long short-term memory network (Bi-LSTM). The second is to identify the semantic structure among a group of given concept names, using an original granularity-based method that leverages these trained models.
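
As a rough illustration of how a pairwise relation predictor can be turned into a structure over a group of names, the sketch below implements a plain pairwise baseline: every ordered pair of names is passed to a predictor, and each positive prediction becomes a (child, parent) edge. The predictor here, `toy_is_direct_child`, is a hypothetical lexical stand-in for a trained CNN or Bi-LSTM, and the concept names are illustrative only. The granularity-based method itself is not specified in this abstract, so it is not reproduced.

```python
from itertools import permutations
from typing import Callable, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (child, parent)


def toy_is_direct_child(child: str, parent: str) -> bool:
    """Hypothetical stand-in for a trained relation classifier.

    Assumption for illustration only: `child` is a direct child of
    `parent` when it equals `parent` plus exactly one leading modifier.
    """
    c, p = child.lower().split(), parent.lower().split()
    return len(c) == len(p) + 1 and c[1:] == p


def pairwise_structure(
    names: Iterable[str],
    predict: Callable[[str, str], bool],
) -> Set[Edge]:
    """Query the predictor on every ordered pair and keep positive edges."""
    return {
        (child, parent)
        for child, parent in permutations(list(names), 2)
        if predict(child, parent)
    }


names = [
    "Artery",
    "Coronary artery",
    "Left coronary artery",
    "Right coronary artery",
]
edges = pairwise_structure(names, toy_is_direct_child)
for child, parent in sorted(edges):
    print(f"{child} -> {parent}")
```

With n names this baseline makes n(n-1) predictor calls and trusts each call independently, so a single wrong prediction directly adds or drops an edge in the resulting structure.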

RESULTS

Results show that both CNN and Bi-LSTM perform well on the first task, with F1 measures above 0.91. For the second task, our approach achieves an average F1 measure of 0.79 on 100 case studies in the FMA using Bi-LSTM, which outperforms the primitive pairwise-based method.
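
For readers unfamiliar with the metric, the F1 measure is the harmonic mean of precision and recall. The sketch below computes it over sets of (child, parent) edges. Scoring a predicted structure edge by edge against a reference structure is an assumption made for illustration, since the abstract does not state how the per-case F1 was computed, and the edge sets are made up.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (child, parent)


def edge_f1(predicted: Set[Edge], gold: Set[Edge]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted edges against gold edges."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: 3 reference edges, 4 predicted edges, 3 of them correct.
gold = {("B", "A"), ("C", "B"), ("D", "B")}
predicted = {("B", "A"), ("C", "B"), ("D", "B"), ("D", "A")}

precision, recall, f1 = edge_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this example the one spurious edge lowers precision to 0.75 while recall stays at 1.0, giving an F1 of about 0.86.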

CONCLUSIONS

We have investigated an automatic way of predicting a hierarchical relationship between 2 concept names; based on this, we have further developed a method to structure a group of concept names automatically. This study is an initial investigation that will inform further work on the automatic creation and enrichment of biomedical ontologies.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c68b/7725650/42692d3ee8a2/medinform_v8i11e22333_fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c68b/7725650/4cb8b8ffc0ff/medinform_v8i11e22333_fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c68b/7725650/fcbd347b3bac/medinform_v8i11e22333_fig3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c68b/7725650/842f351ec654/medinform_v8i11e22333_fig4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c68b/7725650/2124d10d8eb0/medinform_v8i11e22333_fig5.jpg
