Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.
School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China.
Stud Health Technol Inform. 2024 Aug 22;316:899-903. doi: 10.3233/SHTI240557.
Open-source, lightweight, offline generative large language models (LLMs) hold promise for clinical information extraction because they can run in secure environments on commodity hardware without token costs. We created a simple lupus nephritis (LN) renal histopathology annotation schema, generated gold-standard data, and used them to investigate prompt-based strategies with three state-of-the-art lightweight LLMs: BioMistral-DARE-7B (BioMistral), Llama-2-13B (Llama 2), and Mistral-7B-instruct-v0.2 (Mistral). We examined the performance of these LLMs in a zero-shot setting for information extraction from renal histopathology reports. Across four prompting strategies, including combinations of batch prompt (BP), single task prompt (SP), chain of thought (CoT) and standard simple prompt (SSP), both Mistral and BioMistral consistently outperformed Llama 2. Mistral performed best, achieving an F1-score of 0.996 [95% CI: 0.993, 0.999] for extracting the numbers of various subtypes of glomeruli across all BP settings and 0.898 [95% CI: 0.871, 0.921] for extracting relational values of immune markers under the BP+SSP setting. This study underscores the capability of offline LLMs to provide accurate and secure clinical information extraction, making them a promising alternative to their heavyweight online counterparts.
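As an illustration of how the prompting strategies might combine, the following is a minimal sketch rather than the study's actual prompts. It assumes the four settings are the pairings of BP or SP with CoT or SSP (only BP+SSP is named explicitly above). The field names, instruction wording, and function names are all hypothetical.

```python
from itertools import product

# Hypothetical extraction targets; the study's real annotation schema is not
# given in the abstract.
FIELDS = [
    "total number of glomeruli",
    "number of globally sclerosed glomeruli",
    "IgG staining intensity",
]

# Hypothetical instruction wording for the two answer styles.
COT_SUFFIX = "Think step by step, then state the final answer."
SSP_SUFFIX = "Answer with the value only."


def build_prompts(report, fields, batch, cot):
    """Return the list of prompts sent to the model for one report.

    batch=True  -> one prompt asking for every field (batch prompt, BP)
    batch=False -> one prompt per field (single task prompt, SP)
    cot=True    -> chain-of-thought instruction (CoT)
    cot=False   -> standard simple prompt instruction (SSP)
    """
    suffix = COT_SUFFIX if cot else SSP_SUFFIX
    if batch:
        items = "\n".join(f"{i}. {name}" for i, name in enumerate(fields, 1))
        return [f"Report:\n{report}\n\nExtract the following items:\n{items}\n{suffix}"]
    return [
        f"Report:\n{report}\n\nExtract the following item: {name}\n{suffix}"
        for name in fields
    ]


def all_settings(report, fields):
    """Build prompts for each assumed setting: BP/SP crossed with CoT/SSP."""
    settings = {}
    for batch, cot in product([True, False], repeat=2):
        label = f"{'BP' if batch else 'SP'}+{'CoT' if cot else 'SSP'}"
        settings[label] = build_prompts(report, fields, batch, cot)
    return settings
```

Under these assumptions, a BP setting issues one model call per report while an SP setting issues one call per field, which is the main practical trade-off between the two.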