The influence of prompt engineering on large language models for protein-protein interaction identification in biomedical literature.

Authors

Chang Yung-Chun, Huang Ming-Siang, Huang Yi-Hsuan, Lin Yi-Hsuan

Affiliations

Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan.

Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan.

Publication information

Sci Rep. 2025 May 3;15(1):15493. doi: 10.1038/s41598-025-99290-4.

Abstract

Identifying protein-protein interactions (PPIs) is a foundational task in biomedical natural language processing. While specialized models have been developed, the potential of general-domain large language models (LLMs) in PPI extraction, particularly for researchers without computational expertise, remains unexplored. This study evaluates the effectiveness of proprietary LLMs (GPT-3.5, GPT-4, and Google Gemini) in PPI prediction through systematic prompt engineering. We designed six prompting scenarios of increasing complexity, from basic interaction queries to sophisticated entity-tagged formats, and assessed model performance across multiple benchmark datasets (LLL, IEPA, HPRD50, AIMed, BioInfer, and PEDD). Carefully designed prompts effectively guided LLMs in PPI prediction. Gemini 1.5 Pro achieved the highest performance across most datasets, with notable F-scores in LLL (90.3%), IEPA (68.2%), HPRD50 (67.5%), and PEDD (70.2%). GPT-4 showed competitive performance, particularly in the LLL dataset (87.3%). We identified and addressed a positive prediction bias, demonstrating improved performance after evaluation refinement. While not surpassing specialized models, general-purpose LLMs with appropriate prompting strategies can effectively perform PPI prediction tasks, offering valuable tools for biomedical researchers without extensive computational expertise.
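The workflow summarized in the abstract (entity-tagged prompts, a yes/no interaction query, and F-score evaluation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `<PROT1>`/`<PROT2>` tag format, the prompt wording, the helper names, and the reading of "F-score" as the balanced F1 are hypothetical and are not taken from the paper. No LLM is called; the model reply is treated as a plain string.

```python
from typing import List, Tuple


def tag_entities(sentence: str, protein_a: str, protein_b: str) -> str:
    """Wrap the first occurrence of each candidate protein in hypothetical tags."""
    tagged = sentence.replace(protein_a, f"<PROT1>{protein_a}</PROT1>", 1)
    tagged = tagged.replace(protein_b, f"<PROT2>{protein_b}</PROT2>", 1)
    return tagged


def build_prompt(sentence: str, protein_a: str, protein_b: str) -> str:
    """Build an illustrative entity-tagged yes/no prompt (wording is assumed)."""
    tagged = tag_entities(sentence, protein_a, protein_b)
    return (
        "Does the following sentence state that the two tagged proteins "
        "interact? Answer Yes or No.\n"
        f"Sentence: {tagged}"
    )


def parse_answer(reply: str) -> bool:
    """Treat any reply that starts with 'yes' (case-insensitive) as positive."""
    return reply.strip().lower().startswith("yes")


def precision_recall_f1(
    gold: List[bool], pred: List[bool]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for binary interaction labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    print(build_prompt("GerE binds to the sigK promoter.", "GerE", "sigK"))
    # A model that always answers "Yes" gets perfect recall but low precision.
    gold = [True, True, False, False]
    always_yes = [True, True, True, True]
    print(precision_recall_f1(gold, always_yes))
```

The always-positive example at the end shows why a positive prediction bias of the kind the abstract mentions matters: recall stays at 1.0 while precision falls, which pulls the F1 down.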

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0ff/12049485/a560b6821bf1/41598_2025_99290_Fig1_HTML.jpg
