COS: A new MeSH term embedding incorporating corpus, ontology, and semantic predications.

Author affiliation

Department of Computer Science and Engineering, University of North Texas, Denton, Texas, United States of America.

Publication information

PLoS One. 2021 May 4;16(5):e0251094. doi: 10.1371/journal.pone.0251094. eCollection 2021.

DOI:10.1371/journal.pone.0251094
PMID:33945566
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8096083/
Abstract

The embedding of Medical Subject Headings (MeSH) terms has become a foundation for many downstream bioinformatics tasks. Recent studies employ different data sources, such as the corpus (in which each document is indexed by a set of MeSH terms), the MeSH term ontology, and the semantic predications between MeSH terms (extracted by SemMedDB), to learn their embeddings. While these data sources contribute to learning the MeSH term embeddings, current approaches fail to incorporate all of them in the learning process. The challenge is that the structured relationships between MeSH terms are different across the data sources, and there is no approach to fusing such complex data into the MeSH term embedding learning. In this paper, we study the problem of incorporating corpus, ontology, and semantic predications to learn the embeddings of MeSH terms. We propose a novel framework, Corpus, Ontology, and Semantic predications-based MeSH term embedding (COS), to generate high-quality MeSH term embeddings. COS converts the corpus, ontology, and semantic predications into MeSH term sequences, merges these sequences, and learns MeSH term embeddings using the sequences. Extensive experiments on different datasets show that COS outperforms various baseline embeddings and traditional non-embedding-based baselines.
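The pipeline described in the abstract (convert each source into MeSH term sequences, merge them, then learn from the merged sequences) can be illustrated with a minimal Python sketch. The abstract does not say how each source is turned into sequences, so everything below is an assumption made for illustration: the random walks over the ontology, the dropping of the predicate from each predication, the toy data, and all function names are hypothetical. The last step only extracts skip-gram style (center, context) pairs; it shows what a sequence-based embedding model would consume and does not train embeddings.

```python
import random
from collections import defaultdict


def corpus_sequences(docs):
    """One sequence per document: the MeSH terms it is indexed with."""
    return [list(terms) for terms in docs if terms]


def ontology_sequences(edges, walk_length, walks_per_node, rng):
    """Uniform random walks over the undirected parent-child graph.

    Random walks are an assumption of this sketch, not a detail
    taken from the abstract.
    """
    neighbors = defaultdict(list)
    for parent, child in edges:
        neighbors[parent].append(child)
        neighbors[child].append(parent)
    walks = []
    for start in sorted(neighbors):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(neighbors[walk[-1]]))
            walks.append(walk)
    return walks


def predication_sequences(triples):
    """Each (subject, predicate, object) triple becomes [subject, object].

    Dropping the predicate is a simplification assumed here.
    """
    return [[subj, obj] for subj, _predicate, obj in triples]


def skipgram_pairs(sequences, window):
    """(center, context) pairs within `window` positions of each other."""
    pairs = []
    for seq in sequences:
        for i, center in enumerate(seq):
            start = max(0, i - window)
            stop = min(len(seq), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    pairs.append((center, seq[j]))
    return pairs


# Toy inputs, invented for illustration only.
docs = [["Neoplasms", "Antineoplastic Agents"],
        ["Diabetes Mellitus", "Insulin"]]
edges = [("Diseases", "Neoplasms"), ("Diseases", "Diabetes Mellitus")]
triples = [("Insulin", "TREATS", "Diabetes Mellitus")]

rng = random.Random(0)  # fixed seed so the walks are reproducible
merged = (corpus_sequences(docs)
          + ontology_sequences(edges, walk_length=4, walks_per_node=2, rng=rng)
          + predication_sequences(triples))
pairs = skipgram_pairs(merged, window=2)
print(len(merged), "sequences,", len(pairs), "training pairs")
```

In this sketch the merged list could be passed to any sequence-based embedding learner in place of `skipgram_pairs`; which learner the paper uses is not stated in the abstract.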

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbd3/8096083/70e804d9d990/pone.0251094.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbd3/8096083/f736325ee38d/pone.0251094.g001.jpg
