Contrasting Multi-Source Temporal Knowledge Graphs for Biomedical Hypothesis Generation.

Author information

Zhou Huiwei, Li Wenchu, Yao Weihong, Lin Yingyu, Du Lei

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2024 Nov-Dec;21(6):2102-2112. doi: 10.1109/TCBB.2024.3451051. Epub 2024 Dec 10.

Abstract

Hypothesis Generation (HG) aims to expedite biomedical research by generating novel hypotheses from existing scientific literature. Most existing studies focus on modeling static snapshots of the corpus, neglecting the temporal evolution of scientific terms. Despite recent efforts to learn term evolution from Knowledge Bases (KBs) for HG, the temporal information in multi-source KBs, which contains important, up-to-date knowledge, is still overlooked. In this paper, an innovative Temporal Contrastive Learning (TCL) framework is introduced to uncover latent associations between entities by jointly modeling their co-evolution across multi-source temporal KBs. Specifically, we first construct a temporal relation graph based on PubMed papers and a biomedical relation database (such as the Comparative Toxicogenomics Database (CTD)). The constructed temporal relation graph and a temporal concept graph (such as Medical Subject Headings (MeSH)) are then used to train two GCN-based recurrent networks, one per graph, that learn temporally evolving entity embeddings. Finally, a cross-view temporal prediction task is designed to learn knowledge-enriched temporal embeddings by contrasting the temporal embeddings learned from the two Temporal Knowledge Graphs (TKGs). Experiments on three real-world biomedical term relationship datasets show that the proposed approach clearly outperforms approaches based on a single TKG and achieves state-of-the-art performance.
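
The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of what contrasting embeddings from two views can look like, using a generic symmetric InfoNCE loss as a stand-in rather than the authors' actual loss. The function names, the `temperature` value, and the array names `z_rel` (relation-graph view) and `z_con` (concept-graph view) are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def cross_view_info_nce(z_rel, z_con, temperature=0.1):
    """Symmetric InfoNCE loss between two views of the same entities.

    z_rel: (n_entities, dim) embeddings from the relation-graph view (hypothetical).
    z_con: (n_entities, dim) embeddings from the concept-graph view (hypothetical).
    Row i of both arrays is assumed to describe the same entity at the same time step.
    """
    a, b = l2_normalize(z_rel), l2_normalize(z_con)
    logits = a @ b.T / temperature           # (n, n) cross-view similarities
    idx = np.arange(len(a))                  # matching pairs sit on the diagonal
    loss_ab = -log_softmax(logits)[idx, idx].mean()    # relation view -> concept view
    loss_ba = -log_softmax(logits.T)[idx, idx].mean()  # concept view -> relation view
    return 0.5 * (loss_ab + loss_ba)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z_rel = rng.normal(size=(8, 16))
    z_con = z_rel + 0.05 * rng.normal(size=(8, 16))   # second view agrees with the first
    print("aligned views:   ", cross_view_info_nce(z_rel, z_con))
    # Shifting rows by one pairs every entity with the wrong partner.
    print("misaligned views:", cross_view_info_nce(z_rel, np.roll(z_con, 1, axis=0)))
```

In this sketch the loss is small when row i of one view is most similar to row i of the other, and grows when the pairing is broken. In a setup like the one described, the two input arrays would come from the two GCN-based recurrent encoders; here they are random placeholders.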
