Enhancing Clinical Data Extraction from Pathology Reports: A Comparative Analysis of Large Language Models.

Affiliations

Department of Medical Informatics, College of Medicine, The Catholic University of Korea, South Korea.

Department of Biomedicine and Health Sciences, South Korea.

Publication information

Stud Health Technol Inform. 2024 Aug 22;316:756-760. doi: 10.3233/SHTI240523.

DOI:10.3233/SHTI240523

PMID:39176904

Abstract

This study evaluates the efficacy of a small large language model (sLLM) in extracting critical information from free-text pathology reports across multiple centers, addressing the challenges posed by the narrative and complex nature of these documents. Employing three variants of the Llama 2 model, with 7 billion, 13 billion, and 70 billion parameters, the research assesses model performance in both zero-shot and five-shot settings, offering insights into the impact of example-based learning. A specialized information extraction tool utilizing regular expressions for pattern identification serves as the benchmark for evaluating the models' accuracy. Conducted within a hospital's internal environment, the study emphasizes the clinical applicability of these findings. The results reveal significant variations in model performance, with the 70 billion parameter model achieving remarkable accuracy in the five-shot scenario, demonstrating the potential of sLLMs in enhancing the efficiency and accuracy of data extraction from pathology reports. The study highlights the importance of example-driven learning and the trade-offs between model size, accuracy, hallucination rates, and processing time. These findings contribute to the ongoing efforts to integrate advanced language models into clinical settings, potentially transforming patient care and biomedical research by mitigating the limitations of manual data extraction processes.
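The evaluation design described in the abstract (zero-shot versus five-shot prompting, scored against a regular-expression extractor) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the field names, the report layout, the regular expressions, the prompt wording, and the helper names `regex_extract`, `build_prompt`, and `field_accuracy` are hypothetical and are not taken from the paper. No model is called; the sketch only shows how the prompts and the score might be assembled.

```python
import re

# Hypothetical field patterns; the paper does not publish its regular expressions.
PATTERNS = {
    "histologic_type": re.compile(r"Histologic type:\s*(.+)", re.IGNORECASE),
    "tumor_size_cm": re.compile(r"Tumor size:\s*([\d.]+)\s*cm", re.IGNORECASE),
}


def regex_extract(report):
    """Rule-based reference extraction: first match per field, else None."""
    extracted = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(report)
        extracted[field] = match.group(1).strip() if match else None
    return extracted


def build_prompt(report, examples=()):
    """Zero-shot when `examples` is empty; k-shot when it holds k (report, answer) pairs."""
    parts = ["Extract histologic_type and tumor_size_cm from the pathology report."]
    for example_report, example_answer in examples:
        parts.append(f"Report:\n{example_report}\nAnswer: {example_answer}")
    parts.append(f"Report:\n{report}\nAnswer:")
    return "\n\n".join(parts)


def field_accuracy(predictions, references):
    """Fraction of (report, field) pairs where the prediction equals the reference."""
    total = 0
    correct = 0
    for predicted, reference in zip(predictions, references):
        for field, value in reference.items():
            total += 1
            correct += int(predicted.get(field) == value)
    return correct / total if total else 0.0
```

In this sketch, passing five (report, answer) pairs to `build_prompt` yields the five-shot prompt and passing none yields the zero-shot prompt; the parsed model outputs would then be compared with `regex_extract` results through `field_accuracy`. Hallucination rate and processing time, which the abstract also weighs, are not modeled here.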
