Keating Muireann, Bollard Stephanie M, Potter Shirley
Department of Plastic and Reconstructive Surgery, St James's Hospital, Dublin, IRL.
School of Medicine, University College Dublin, Dublin, IRL.
Cureus. 2024 Nov 17;16(11):e73874. doi: 10.7759/cureus.73874. eCollection 2024 Nov.
INTRODUCTION: Within plastic surgery, the internet is the most common first point of information for patients before they consult a surgeon. Free-to-use artificial intelligence (AI) websites such as ChatGPT (Generative Pre-trained Transformer) are attractive applications for patient information due to their ability to instantaneously answer almost any query. Although relatively new, ChatGPT is now one of the most popular AI conversational software tools. The aim of this study was to evaluate the quality and readability of information given by ChatGPT-4 on key areas in plastic and reconstructive surgery.

METHODS: The ten plastic and aesthetic surgery topics with the highest worldwide search volume over the past 15 years were identified. These were rephrased into question format to create nine individual questions, which were then input into ChatGPT-4. Response quality was assessed using the DISCERN instrument. The readability and grade reading level of the responses were calculated using the Flesch-Kincaid Reading Ease (FKRE) Index and the Coleman-Liau (CL) Index. Twelve physicians working in a plastic and reconstructive surgery unit were asked to rate the clarity and accuracy of the answers on a scale of 1-10 and to state ('yes' or 'no') whether they would share the generated response with a patient.

RESULTS: All answers were scored as poor or very poor according to the DISCERN tool, with a mean DISCERN score of 34 across all questions. The responses also scored low in readability and understandability: the mean FKRE index was 33.6, and the mean CL index was 15.6. Clinicians working in plastic and reconstructive surgery rated the responses highly for clarity and accuracy, with a mean clarity score of 7.38 and a mean accuracy score of 7.4.

CONCLUSION: This study found that, according to validated quality assessment tools, ChatGPT-4 produced low-quality information when asked about popular queries relating to plastic and aesthetic surgery. Furthermore, the information produced was pitched at a high reading level. However, the responses were still rated highly for clarity and accuracy by clinicians working in plastic surgery. Although improvements need to be made, this study suggests that language models such as ChatGPT could be a useful starting point when developing written health information. With the expansion of AI, improvements in content quality are anticipated.
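For context on the two readability metrics named in the methods, the sketch below shows how they are conventionally computed. This is a minimal illustration, not the authors' tooling (which the abstract does not specify): the word/sentence splitting is a simple regex and the syllable count is a naive vowel-group heuristic, so scores may differ slightly from dedicated readability software.

    import re

    def count_syllables(word: str) -> int:
        # Naive heuristic: count runs of consecutive vowels, minimum one.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def flesch_kincaid_reading_ease(text: str) -> float:
        # FKRE = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words);
        # higher scores mean easier text (90-100 ~ fifth grade, <30 ~ graduate level).
        sentences = max(1, len(re.findall(r"[.!?]+", text)))
        words = re.findall(r"[A-Za-z]+", text)
        n_words = max(1, len(words))
        syllables = sum(count_syllables(w) for w in words)
        return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)

    def coleman_liau_index(text: str) -> float:
        # CLI = 0.0588*L - 0.296*S - 15.8, where L = letters per 100 words
        # and S = sentences per 100 words; the result approximates a US grade level.
        sentences = max(1, len(re.findall(r"[.!?]+", text)))
        words = re.findall(r"[A-Za-z]+", text)
        n_words = max(1, len(words))
        letters = sum(len(w) for w in words)
        L = letters / n_words * 100
        S = sentences / n_words * 100
        return 0.0588 * L - 0.296 * S - 15.8

    if __name__ == "__main__":
        sample = "Rhinoplasty reshapes the nose. Recovery usually takes several weeks."
        print(f"FKRE:     {flesch_kincaid_reading_ease(sample):.1f}")
        print(f"CL grade: {coleman_liau_index(sample):.1f}")

On these scales, the reported mean FKRE of 33.6 falls in the "difficult to read" band and the mean CL index of 15.6 corresponds roughly to a college reading level, well above the sixth-to-eighth-grade level commonly recommended for patient-facing health materials.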