Leiser Florian, Guse Richard, Sunyaev Ali
Research Group Critical Information Infrastructures, Institute of Applied Informatics and Formal Description Methods, Karlsruhe Institute of Technology, Karlsruhe, Germany.
Chair of Information Infrastructures, School of Computation, Information and Technology, Technical University of Munich, Campus Heilbronn, Heilbronn, Germany.
J Med Internet Res. 2025 Jun 19;27:e70315. doi: 10.2196/70315.
Large language models (LLMs) can support health care professionals in their daily work, for example, when writing and filing reports or communicating diagnoses. With the rise of LLMs, current research investigates how LLMs could be applied in medical practice and how they could benefit physicians in clinical workflows. However, most studies neglect the importance of selecting a suitable LLM architecture.
In this literature review, we aim to provide insights into the different LLM architecture families (ie, Bidirectional Encoder Representations from Transformers [BERT]-based and generative pretrained transformer [GPT]-based models) used in previous research. We report on the suitability and benefits of the different LLM architecture families for various research foci.
To this end, we conducted a scoping review to identify which LLMs are used in health care. Our search covered manuscripts from PubMed, arXiv, and medRxiv. Using open and selective coding, we assessed the 114 identified manuscripts along 11 dimensions covering usage, technical facets, and research focus.
We identified 4 research foci in the reviewed manuscripts, with LLM performance being the main focus. We found that GPT-based models are used for communicative purposes such as examination preparation or patient interaction. In contrast, BERT-based models are used for medical tasks such as knowledge discovery and model improvement.
Our study suggests that GPT-based models are better suited for communicative purposes such as report generation or patient interaction, whereas BERT-based models seem better suited for innovative applications such as classification or knowledge discovery. This could be due to architectural differences: GPT processes language unidirectionally, whereas BERT processes it bidirectionally, allowing a more in-depth understanding of the text. In addition, BERT-based models seem to allow more straightforward extension for domain-specific tasks, which generally leads to better results. In summary, health care professionals should consider the benefits and differences of the LLM architecture families when selecting a suitable model for their intended purpose.
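To make the architectural contrast concrete, the following minimal Python sketch (assuming the Hugging Face transformers library; the checkpoints bert-base-uncased and gpt2 are illustrative placeholders, not models evaluated in this review) shows the typical usage pattern of each family: masked-token prediction with a bidirectional encoder versus left-to-right generation with a causal decoder.

from transformers import pipeline

# BERT-based (encoder): masked-token prediction uses context on BOTH sides
# of the blank, which is why this family suits classification and
# knowledge discovery tasks.
fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("The patient was diagnosed with [MASK] pneumonia.")[0]["token_str"])

# GPT-based (decoder): causal, unidirectional decoding extends a prompt
# token by token, which is why this family suits report drafting and
# patient-facing dialogue.
gen = pipeline("text-generation", model="gpt2")
print(gen("Patient summary: a 67-year-old man presenting with",
          max_new_tokens=30)[0]["generated_text"])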