Pandya Sumaarg, Alessandri Bonetti Mario, Liu Hilary Y, Jeong Tiffany, Ziembicki Jenny A, Egro Francesco M
Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA 15213, United States.
Department of Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA 15213, United States.
J Burn Care Res. 2025 Jan 6. doi: 10.1093/jbcr/irae211.
Patients often use Google to answer their medical questions. With the emergence of artificial intelligence large language models such as ChatGPT, patients may turn to these technologies as an alternative source of medical information. This study investigates the safety, accuracy, and comprehensiveness of the medical responses provided by ChatGPT compared with Google for common questions about burn injuries and their management. A Google search was performed using the term "burn," and the ten most frequently searched questions, along with their answers, were documented. These questions were then posed to ChatGPT. The quality of the responses from both Google and ChatGPT was evaluated by three burn and trauma surgeons using the Global Quality Score (GQS) scale, which ranges from 1 (poor quality) to 5 (excellent quality). A Wilcoxon signed-rank test (the nonparametric analogue of a paired t-test) evaluated the difference in scores between the Google and ChatGPT answers. Google answers scored an average of 2.80 ± 1.03, indicating that some information was present but important topics were missing. Conversely, ChatGPT-generated answers scored an average of 4.57 ± 0.73, indicating excellent-quality responses with high utility to patients. For half of the questions, the surgeons unanimously preferred that their patients receive information from ChatGPT. This study presents an initial comparison of Google and ChatGPT responses to commonly asked burn injury questions. Based on the evaluations of three experienced burn surgeons, ChatGPT outperforms Google in answering common questions on burn injury and management. These results highlight the potential of ChatGPT as a resource for patient education.
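The analysis described above is a paired, nonparametric comparison of per-question quality ratings, which is straightforward to reproduce in code. The sketch below (Python with SciPy) shows how such a comparison could be run; the per-question GQS values are hypothetical placeholders for illustration only, not the study's data.

import numpy as np
from scipy.stats import wilcoxon

# Hypothetical per-question GQS ratings (1-5), one value per question,
# averaged across the three raters -- placeholders, NOT the study's data.
google_gqs  = [3, 2, 4, 2, 3, 2, 3, 4, 2, 3]
chatgpt_gqs = [5, 4, 5, 4, 5, 4, 5, 5, 4, 5]

# Mean ± sample standard deviation, matching the abstract's reporting format
print(f"Google:  {np.mean(google_gqs):.2f} ± {np.std(google_gqs, ddof=1):.2f}")
print(f"ChatGPT: {np.mean(chatgpt_gqs):.2f} ± {np.std(chatgpt_gqs, ddof=1):.2f}")

# Wilcoxon signed-rank test on the paired per-question scores
stat, p = wilcoxon(google_gqs, chatgpt_gqs)
print(f"Wilcoxon signed-rank: statistic = {stat:.1f}, p = {p:.4f}")

For these placeholder values, every per-question difference favors ChatGPT, so the signed-rank statistic (the smaller of the two signed rank sums) is 0 and the test rejects the null hypothesis of no difference at any conventional significance level.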