
Patient-Friendly Discharge Summaries in Korea Based on ChatGPT: Software Development and Validation.

Affiliations

College of Nursing, Yonsei University, Seoul, Korea.

Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Korea.

Publication Information

J Korean Med Sci. 2024 Apr 29;39(16):e148. doi: 10.3346/jkms.2024.39.e148.

Abstract

BACKGROUND

Although discharge summaries in patient-friendly language can enhance patient comprehension and satisfaction, they can also increase medical staff workload. Using a large language model, we developed and validated software that generates a patient-friendly discharge summary.

METHODS

We developed and tested the software using 100 discharge summary documents, 50 for patients with myocardial infarction and 50 for patients treated in the Department of General Surgery. For each document, three new summaries were generated using three different prompting methods (Zero-shot, One-shot, and Few-shot) and graded using a 5-point Likert Scale regarding factuality, comprehensiveness, usability, ease, and fluency. We compared the effects of different prompting methods and assessed the relationship between input length and output quality.
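
The abstract does not reproduce the authors' prompts or model configuration, but a minimal sketch of how the three prompting conditions could be assembled against a chat-style LLM API is shown below. The model name ("gpt-4"), instruction wording, and placeholder examples are illustrative assumptions, not the study's actual materials.

```python
# Illustrative sketch only: the exact prompts, worked examples, and model used
# in the study are not reproduced here; "gpt-4" and the wording are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

INSTRUCTION = (
    "Rewrite the following hospital discharge summary in plain, "
    "patient-friendly Korean, avoiding medical jargon."
)

# Hypothetical worked examples: the Few-shot condition would include several,
# the One-shot condition exactly one, and the Zero-shot condition none.
EXAMPLES = [
    {"source": "<original discharge summary 1>", "rewrite": "<patient-friendly version 1>"},
    {"source": "<original discharge summary 2>", "rewrite": "<patient-friendly version 2>"},
]

def build_messages(document: str, n_examples: int) -> list[dict]:
    """Assemble a chat prompt with 0 (Zero-shot), 1 (One-shot), or more (Few-shot) examples."""
    messages = [{"role": "system", "content": INSTRUCTION}]
    for ex in EXAMPLES[:n_examples]:
        messages.append({"role": "user", "content": ex["source"]})
        messages.append({"role": "assistant", "content": ex["rewrite"]})
    messages.append({"role": "user", "content": document})
    return messages

def summarize(document: str, n_examples: int = 2) -> str:
    """Generate a patient-friendly summary under the chosen prompting condition."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model; the study's exact model and version may differ
        messages=build_messages(document, n_examples),
    )
    return response.choices[0].message.content
```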

RESULTS

The mean overall scores differed across prompting methods (4.19 ± 0.36 with Few-shot, 4.11 ± 0.36 with One-shot, and 3.73 ± 0.44 with Zero-shot prompts; P < 0.001). Post-hoc analysis indicated that scores were higher with Few-shot and One-shot prompts than with Zero-shot prompts, whereas there was no significant difference between Few-shot and One-shot prompts. The overall proportion of outputs that scored ≥ 4 was 77.0% (95% confidence interval [CI], 68.8-85.3%), 70.0% (95% CI, 61.0-79.0%), and 32.0% (95% CI, 22.9-41.1%) with Few-shot, One-shot, and Zero-shot prompts, respectively. The mean factuality score was 4.19 ± 0.60 with Few-shot, 4.20 ± 0.55 with One-shot, and 3.82 ± 0.57 with Zero-shot prompts. Input length and the overall score showed negative correlations in the Zero-shot (r = -0.437, P < 0.001) and One-shot (r = -0.327, P < 0.001) tests but not in the Few-shot (r = -0.050, P = 0.625) tests.
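
The reported confidence intervals for the ≥ 4 proportions are consistent with a normal-approximation (Wald) interval over the 100 documents per condition; the interval type is an assumption here, as the abstract does not state it. A quick check:

```python
# Quick check of the reported 95% CIs, assuming a normal-approximation (Wald)
# interval p ± 1.96 * sqrt(p * (1 - p) / n) with n = 100 documents per condition.
from math import sqrt

def wald_ci(p: float, n: int = 100, z: float = 1.96) -> tuple[float, float]:
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

for label, p in [("Few-shot", 0.77), ("One-shot", 0.70), ("Zero-shot", 0.32)]:
    lo, hi = wald_ci(p)
    print(f"{label}: {p:.0%} (95% CI {lo:.1%}-{hi:.1%})")
# Reproduces the reported intervals (68.8-85.3%, 61.0-79.0%, 22.9-41.1%)
# to within 0.1 percentage points of rounding.
```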

CONCLUSION

Large language models using Few-shot prompts generally produce acceptable discharge summaries without significant misinformation. Our research highlights the potential of such models in creating patient-friendly discharge summaries for Korean patients to support patient-centered care.


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a29/11058343/ace47cb2129f/jkms-39-e148-g001.jpg
