Learning to Rank Complex Biomedical Hypotheses for Accelerating Scientific Discovery.

Authors

Ding Juncheng, Dahal Shailesh, Adhikari Bijaya, Jha Kishlay

Affiliations

University of North Texas, Denton, TX, USA.

University of Iowa, Iowa City, IA, USA.

Publication information

Proc (IEEE Int Conf Healthc Inform). 2024 Jun;2024:285-293. doi: 10.1109/ichi61247.2024.00044. Epub 2024 Aug 22.

DOI:10.1109/ichi61247.2024.00044
PMID:40109372
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11920884/
Abstract

Hypothesis generation (HG) is a fundamental problem in biomedical text mining that uncovers plausible implicit links (B terms) between two disjoint concepts of interest (A and C terms). Over the past decade, many HG approaches based on distributional statistics, graph-theoretic measures, and supervised machine learning methods have been proposed. Despite significant advances, the existing approaches have two major limitations. First, they mainly focus on enumerating hypotheses and often neglect to rank them in a semantically meaningful way. This wastes time and resources, as researchers may focus on hypotheses that are ultimately not supported by experimental evidence. Second, the existing approaches are designed to rank hypotheses with only one intermediate or evidence term (referred to as simple hypotheses), and thus are unable to handle hypotheses with multiple intermediate terms (referred to as complex hypotheses). This is limiting because recent research has shown that complex hypotheses could be of greater practical value than simple ones, especially in the early stages of scientific discovery. To address these issues, we propose a new HG ranking approach that leverages the expressive power of Graph Neural Networks (GNN) coupled with a domain knowledge-guided Noise-Contrastive Estimation (NCE) strategy to effectively rank both simple and complex biomedical hypotheses. Specifically, the message-passing capabilities of GNNs allow our approach to capture the rich interactions between biomedical entities and succinctly handle complex hypotheses with a variable number of intermediate terms. Moreover, the proposed domain knowledge-guided NCE strategy enables the ranking of complex hypotheses based on their coherence with established biomedical knowledge. Extensive experimental results on five recognized biomedical datasets show that the proposed approach consistently outperforms the existing baselines and prioritizes hypotheses worthy of potential clinical trials.
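
The ranking idea described in the abstract can be illustrated with a small sketch. This is not the authors' model: it is a toy, pure-Python approximation that assumes a hypothesis is a path of terms (A, one or more intermediate terms, C), uses neighbour averaging as a stand-in for GNN message passing, mean-pools the node states so paths of any length get a single score, and uses a simplified negative-sampling form of an NCE-style logistic loss. All term names, embeddings, weights, and function names below are hypothetical.

```python
import math

# Hypothetical toy embeddings; a real system would learn these from literature.
EMBEDDINGS = {
    "A": [1.0, 0.0],
    "B1": [0.5, 0.5],
    "B2": [0.0, 1.0],
    "C": [1.0, 1.0],
    "N": [-1.0, -1.0],  # a term assumed to be incoherent with domain knowledge
}
WEIGHTS = [1.0, 1.0]  # hypothetical scoring weights (would normally be trained)


def message_pass(states, rounds=2):
    """Replace each node state with the mean of itself and its path neighbours."""
    dim = len(states[0])
    for _ in range(rounds):
        updated = []
        for i in range(len(states)):
            neighbourhood = states[max(0, i - 1): i + 2]
            updated.append(
                [sum(v[d] for v in neighbourhood) / len(neighbourhood) for d in range(dim)]
            )
        states = updated
    return states


def score_hypothesis(path, embeddings=EMBEDDINGS, weights=WEIGHTS, rounds=2):
    """Score a path of any length: message passing, mean pooling, linear readout."""
    states = message_pass([embeddings[term] for term in path], rounds)
    pooled = [sum(s[d] for s in states) / len(states) for d in range(len(weights))]
    return sum(w * p for w, p in zip(weights, pooled))


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def nce_style_loss(pos_score, neg_scores):
    """Simplified contrastive loss: push the positive up and noise samples down."""
    return -log_sigmoid(pos_score) - sum(log_sigmoid(-s) for s in neg_scores)


def rank_hypotheses(paths):
    """Return paths sorted from highest to lowest score."""
    return sorted(paths, key=score_hypothesis, reverse=True)


simple_hyp = ["A", "B1", "C"]          # one intermediate term
complex_hyp = ["A", "B1", "B2", "C"]   # two intermediate terms
noise_hyp = ["A", "N", "C"]            # corrupted hypothesis used as a noise sample

ranking = rank_hypotheses([noise_hyp, simple_hyp, complex_hyp])
loss = nce_style_loss(
    score_hypothesis(complex_hyp), [score_hypothesis(noise_hyp)]
)
```

In this sketch the noise sample is built by swapping in an arbitrary term; the abstract's "domain knowledge-guided" strategy presumably selects noise samples more carefully, but the details are not given here.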
