Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings.

Authors

Chang David, Balažević Ivana, Allen Carl, Chawla Daniel, Brandt Cynthia, Taylor Richard Andrew

Affiliations

Yale Center for Medical Informatics, Yale University.

School of Informatics, University of Edinburgh, UK.

Publication information

Proc Conf Assoc Comput Linguist Meet. 2020 Jul;2020:167-176. doi: 10.18653/v1/2020.bionlp-1.18.

DOI: 10.18653/v1/2020.bionlp-1.18
PMID: 33746351
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7971091/
Abstract

Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical concept embeddings so far have been lacking. A recent family of models called knowledge graph embeddings have shown promising results on general domain knowledge graphs, and we explore their capabilities in the biomedical domain. We train several state-of-the-art knowledge graph embedding models on the SNOMED-CT knowledge graph, provide a benchmark with comparison to existing methods and in-depth discussion on best practices, and make a case for the importance of leveraging the multi-relational nature of knowledge graphs for learning biomedical knowledge representation. The embeddings, code, and materials will be made available to the community.
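
To make the idea of a knowledge graph embedding concrete, the following is a minimal sketch of how such a model scores (head, relation, tail) triples. The abstract does not name the models that were trained, so the sketch assumes a translation-based scorer in the style of TransE, using squared Euclidean distance for simplicity. The concept names, relation names, and 2-dimensional vectors are hand-set placeholders for illustration; they are not SNOMED-CT identifiers or learned values.

```python
# Hand-set 2-d embeddings, chosen for illustration only.
# A real model would learn these vectors from the graph's triples.
ENTITY = {
    "pneumonia": [0.0, 0.0],
    "lung": [1.0, 0.0],
    "lung_disease": [0.0, 1.0],
}
RELATION = {
    "finding_site": [1.0, 0.0],
    "is_a": [0.0, 1.0],
}


def score(head, relation, tail):
    """TransE-style plausibility: negative squared distance between head + relation and tail."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return -sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def rank_tails(head, relation):
    """Return all entities ordered from most to least plausible tail."""
    return sorted(ENTITY, key=lambda tail: score(head, relation, tail), reverse=True)


def margin_loss(positive, negative, margin=1.0):
    """Margin ranking loss for one (true triple, corrupted triple) pair."""
    return max(0.0, margin - score(*positive) + score(*negative))


if __name__ == "__main__":
    # The same head entity gets a different best tail for each relation type.
    print(rank_tails("pneumonia", "finding_site"))
    print(rank_tails("pneumonia", "is_a"))
```

Because each relation has its own vector, the same head concept is mapped to a different region of the space for each relation type, which is one way a model can exploit the multi-relational structure the abstract emphasizes. Training would adjust the vectors to reduce `margin_loss` over true triples paired with corrupted ones.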

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd66/7971091/786fdc8db70e/nihms-1676481-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd66/7971091/43850ceab86c/nihms-1676481-f0001.jpg
