Performance Assessment of Large Language Models in Medical Consultation: Comparative Study.

Author information

Seo Sujeong, Kim Kyuli, Yang Heyoung

Affiliations

Future Technology Analysis Center, Korea Institute of Science and Technology Information, Seoul, Republic of Korea.

Postal Savings & Insurance Development Institute, Seoul, Republic of Korea.

Publication information

JMIR Med Inform. 2025 Feb 12;13:e64318. doi: 10.2196/64318.

Abstract

BACKGROUND

The recent introduction of generative artificial intelligence (AI) as an interactive consultant has sparked interest in evaluating its applicability in medical discussions and consultations, particularly within the domain of depression.

OBJECTIVE

This study evaluates the capability of large language models (LLMs) to generate responses to depression-related queries.

METHODS

Using the PubMedQA and QuoraQA data sets, we compared various LLMs, including BioGPT, PMC-LLaMA, GPT-3.5, and Llama2, and measured the similarity between the generated and original answers.
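
The abstract does not say which similarity measure was used, so the following is only a minimal sketch of how a generated answer might be scored against an original answer. It assumes a bag-of-words cosine similarity; the function names (tokenize, cosine_similarity, mean_similarity) and the sample strings are hypothetical and are not taken from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(generated, reference):
    """Bag-of-words cosine similarity between two answers (0.0 to 1.0).

    Assumption: this metric is illustrative only; the study's actual
    similarity measure is not stated in the abstract.
    """
    gen_counts = Counter(tokenize(generated))
    ref_counts = Counter(tokenize(reference))
    shared = gen_counts.keys() & ref_counts.keys()
    dot = sum(gen_counts[t] * ref_counts[t] for t in shared)
    norm = math.sqrt(sum(c * c for c in gen_counts.values())) * math.sqrt(
        sum(c * c for c in ref_counts.values())
    )
    # An empty answer has no direction in token space; score it as 0.
    return dot / norm if norm else 0.0


def mean_similarity(pairs):
    """Average similarity over (generated, reference) answer pairs."""
    scores = [cosine_similarity(g, r) for g, r in pairs]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical answer pair, not drawn from PubMedQA or QuoraQA.
    pairs = [
        (
            "Exercise may reduce symptoms of depression.",
            "Regular exercise can reduce depression symptoms.",
        )
    ]
    print(round(mean_similarity(pairs), 3))
```

Averaging the per-pair scores for each model and data set would give one number per model to compare, which matches the comparison described here in outline only. An embedding-based or n-gram overlap measure could be swapped in by replacing cosine_similarity.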

RESULTS

The latest general LLMs, GPT-3.5 and Llama2, exhibited superior performance, particularly in generating responses to medical inquiries from the PubMedQA data set.

CONCLUSIONS

Given the rapid advances in LLM development in recent years, we hypothesize that version upgrades of general LLMs offer greater potential for improving their ability to generate "knowledge text" in the biomedical domain than fine-tuning for the biomedical field does. These findings are expected to contribute to the development of AI-based medical counseling systems.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7102/11888074/3b154431e251/medinform_v13i1e64318_fig1.jpg
