
Evaluating the Utility of a Large Language Model in Answering Common Patients' Gastrointestinal Health-Related Questions: Are We There Yet?

Authors

Lahat Adi, Shachar Eyal, Avidan Benjamin, Glicksberg Benjamin, Klang Eyal

Affiliations

Chaim Sheba Medical Center, Department of Gastroenterology, Affiliated to Tel Aviv University, Tel Aviv 69978, Israel.

Mount Sinai Clinical Intelligence Center, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Publication

Diagnostics (Basel). 2023 Jun 2;13(11):1950. doi: 10.3390/diagnostics13111950.

Abstract

BACKGROUND AND AIMS

Patients frequently have concerns about their disease and find it challenging to obtain accurate information. OpenAI's ChatGPT chatbot (ChatGPT) is a new large language model developed to provide answers to a wide range of questions in various fields. Our aim was to evaluate the performance of ChatGPT in answering patients' questions regarding gastrointestinal health.

METHODS

To evaluate the performance of ChatGPT in answering patients' questions, we used a representative sample of 110 real-life questions. The answers provided by ChatGPT were rated in consensus by three experienced gastroenterologists. The accuracy, clarity, and efficacy of the answers provided by ChatGPT were assessed.

RESULTS

ChatGPT was able to provide accurate and clear answers to patients' questions in some cases, but not in others. For treatment questions, the average accuracy, clarity, and efficacy scores (1 to 5) were 3.9 ± 0.8, 3.9 ± 0.9, and 3.3 ± 0.9, respectively. For symptom questions, the average accuracy, clarity, and efficacy scores were 3.4 ± 0.8, 3.7 ± 0.7, and 3.2 ± 0.7, respectively. For diagnostic test questions, the average accuracy, clarity, and efficacy scores were 3.7 ± 1.7, 3.7 ± 1.8, and 3.5 ± 1.7, respectively.

CONCLUSIONS

While ChatGPT has potential as a source of information, further development is needed. The quality of information is contingent upon the quality of the online information provided. These findings may be useful for healthcare providers and patients alike in understanding the capabilities and limitations of ChatGPT.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fdf3/10252924/b25e6a45c563/diagnostics-13-01950-g001.jpg
