Assessing ChatGPT Ability to Answer Frequently Asked Questions About Essential Tremor.

Affiliations

Department of Medicine, Surgery and Dentistry "Scuola Medica Salernitana", Neuroscience Section, University of Salerno, Via Allende 43, 84081 Baronissi, SA, Italy.

Department of Neurology, "Umberto I" Hospital, Nocera Inferiore (SA), Italy.

Publication information

Tremor Other Hyperkinet Mov (N Y). 2024 Jul 3;14:33. doi: 10.5334/tohm.917. eCollection 2024.

Abstract

BACKGROUND

Large language models (LLMs) driven by artificial intelligence allow people to engage in direct conversations about their health. The accuracy and readability of the answers that ChatGPT, the best-known LLM, gives about Essential Tremor (ET), one of the most common movement disorders, have not yet been evaluated.

METHODS

Answers given by ChatGPT to 10 questions about ET were evaluated by 5 professionals and 15 laypeople with a score ranging from 1 (poor) to 5 (excellent) in terms of clarity, relevance, accuracy (only for professionals), comprehensiveness, and overall value of the response. We further calculated the readability of the answers.

RESULTS

ChatGPT's answers received relatively positive evaluations from both groups, with median scores ranging between 4 and 5 regardless of the type of question. However, there was only moderate agreement between raters, especially in the group of professionals. Moreover, readability levels were poor for all examined answers.

DISCUSSION

ChatGPT provided relatively accurate and relevant answers, with some variability in the ratings given by the group of professionals. This suggests that the raters' degree of literacy about ET influenced the ratings and, indirectly, that the quality of information provided in clinical practice is also variable. Moreover, the readability of the answers provided by ChatGPT was found to be poor. LLMs will likely play a significant role in the future; therefore, health-related content generated by these tools should be monitored.
