Ariana L. Shaari, Disha P. Patil, Saad Mohammed, Parsa P. Salehi
Department of Head and Neck Surgery, Rutgers New Jersey Medical School.
Rutgers School of Dental Medicine, Newark, NJ.
J Craniofac Surg. 2024 Nov 4. doi: 10.1097/SCS.0000000000010832.
To determine the readability and accuracy of information regarding mandible fractures generated by Chat Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4o.
Patients are increasingly turning to generative artificial intelligence to answer medical queries. To date, the accuracy and readability of responses regarding mandible fractures have not been assessed.
Twenty patient questions regarding mandible fractures were developed by querying AlsoAsked (https://alsoasked.com), SearchResponse (https://searchresponse.io), and Answer the Public (https://answerthepublic.com/). The questions were posed to ChatGPT 3.5 and 4o. Readability was assessed by calculating the Flesch-Kincaid Reading Ease, Flesch-Kincaid Grade Level, number of sentences, and percentage of complex words. Accuracy was assessed by a board-certified facial plastic and reconstructive otolaryngologist using a 5-point Likert scale.
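The Flesch-Kincaid formulas used here depend only on counts of sentences, words, and syllables. A minimal sketch of how such scores can be computed is below; the syllable counter is a common vowel-group heuristic (an approximation, not the exact tool the authors used), and the function names are illustrative assumptions.

```python
import re

def count_syllables(word: str) -> int:
    # Heuristic: count contiguous vowel groups; drop a trailing silent 'e'.
    # Approximate only -- dedicated readability tools use dictionaries.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)

def readability(text: str) -> dict:
    """Return Flesch-Kincaid metrics plus the counts reported in the study."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # mean words per sentence
    spw = syllables / len(words)        # mean syllables per word
    # "Complex words" here means words of 3+ syllables (a standard convention).
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return {
        "flesch_reading_ease": 206.835 - 1.015 * wps - 84.6 * spw,
        "fk_grade_level": 0.39 * wps + 11.8 * spw - 15.59,
        "sentences": len(sentences),
        "pct_complex_words": 100 * complex_words / len(words),
    }
```

For patient education materials, a Reading Ease score of roughly 60 or above (about a 6th-8th grade level) is the commonly recommended target that the ChatGPT responses failed to meet.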
No significant differences were observed between the two versions in readability or accuracy. Readability was above recommended levels for patient education materials. Accuracy was low, and the majority of responses were deemed inappropriate for patient use, containing multiple inaccuracies and/or omissions.
ChatGPT produced responses written at a reading level too high for the average patient and containing several inaccurate statements. Patients and clinicians should be aware of the limitations of generative artificial intelligence when seeking medical information regarding mandible fractures.