Learning to Rank Complex Biomedical Hypotheses for Accelerating Scientific Discovery.

Author information

Ding Juncheng, Dahal Shailesh, Adhikari Bijaya, Jha Kishlay

Affiliations

University of North Texas, Denton, TX, USA.

University of Iowa, Iowa City, IA, USA.

Publication information

Proc (IEEE Int Conf Healthc Inform). 2024 Jun;2024:285-293. doi: 10.1109/ichi61247.2024.00044. Epub 2024 Aug 22.

Abstract

Hypothesis generation (HG) is a fundamental problem in biomedical text mining that uncovers plausible implicit links (B terms) between two disjoint concepts of interest (A and C terms). Over the past decade, many HG approaches based on distributional statistics, graph-theoretic measures, and supervised machine learning methods have been proposed. Despite significant advances, the existing approaches have two major limitations. First, they mainly focus on enumerating hypotheses and often neglect to rank them in a semantically meaningful way. This wastes time and resources, as researchers may focus on hypotheses that are ultimately not supported by experimental evidence. Second, the existing approaches are designed to rank hypotheses with only one intermediate or evidence term (referred to as simple hypotheses), and thus are unable to handle hypotheses with multiple intermediate terms (referred to as complex hypotheses). This is limiting because recent research has shown that complex hypotheses could be of greater practical value than simple ones, especially in the early stages of scientific discovery. To address these issues, we propose a new HG ranking approach that leverages the expressive power of Graph Neural Networks (GNNs) coupled with a domain-knowledge-guided Noise-Contrastive Estimation (NCE) strategy to effectively rank both simple and complex biomedical hypotheses. Specifically, the message-passing capabilities of GNNs allow our approach to capture the rich interactions between biomedical entities and succinctly handle complex hypotheses with a variable number of intermediate terms. Moreover, the proposed domain-knowledge-guided NCE strategy enables the ranking of complex hypotheses based on their coherence with established biomedical knowledge. Extensive experimental results on five recognized biomedical datasets show that the proposed approach consistently outperforms the existing baselines and prioritizes hypotheses worthy of potential clinical trials.
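The abstract names two ingredients, message passing over a hypothesis path and an NCE-based ranking objective, without giving details. The toy Python sketch below illustrates the general idea only and is not the authors' model. Everything in it is an assumption made for illustration: the function names, the mean-aggregation update, the linear scoring head, the placeholder term names, and the simplified logistic (negative-sampling) form of NCE, which omits the noise-distribution correction term. How negatives are chosen using domain knowledge is not described in the abstract, so the sketch simply takes negative scores as input.

```python
import math


def message_pass(features, edges, rounds=2):
    """Toy message passing: each node averages itself and its neighbours."""
    neighbours = {n: [n] for n in features}  # self-loop keeps a node's own signal
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    h = {n: list(vec) for n, vec in features.items()}
    for _ in range(rounds):
        h = {
            n: [sum(h[m][d] for m in nbrs) / len(nbrs) for d in range(len(h[n]))]
            for n, nbrs in neighbours.items()
        }
    return h


def score_hypothesis(path, features, weights):
    """Score a path A -> B1 -> ... -> Bk -> C with any number of B terms."""
    edges = list(zip(path, path[1:]))
    h = message_pass({n: features[n] for n in path}, edges)
    # Mean-pool node states so paths of different lengths are comparable.
    pooled = [sum(h[n][d] for n in path) / len(path) for d in range(len(weights))]
    return sum(w * x for w, x in zip(weights, pooled))


def nce_style_loss(pos_score, neg_scores):
    """Logistic loss: push the observed hypothesis up and noise samples down."""
    def sigmoid(s):
        return 1.0 / (1.0 + math.exp(-s))

    loss = -math.log(sigmoid(pos_score))
    loss -= sum(math.log(1.0 - sigmoid(s)) for s in neg_scores)
    return loss


# Hypothetical 2-d term embeddings; the names are placeholders, not real concepts.
features = {
    "A": [1.0, 0.0],
    "B1": [0.8, 0.2],
    "B2": [0.6, 0.4],
    "C": [0.9, 0.1],
    "N1": [0.0, 1.0],
}
weights = [1.0, -1.0]  # assumed linear scoring head

simple = score_hypothesis(["A", "B1", "C"], features, weights)          # one B term
complex_ = score_hypothesis(["A", "B1", "B2", "C"], features, weights)  # two B terms
noise = score_hypothesis(["A", "N1", "C"], features, weights)           # corrupted path
loss = nce_style_loss(complex_, [noise])
```

The same scoring function handles the simple and the complex path because pooling removes the dependence on path length, which is the property the abstract attributes to message passing. At inference time, candidate hypotheses would be sorted by their score.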
