Fazilat Alexander Z, Berry Charlotte E, Churukian Andrew, Lavin Christopher, Kameni Lionel, Brenac Camille, Podda Silvio, Bruckman Karl, Lorenz Hermann P, Khosla Rohit K, Wan Derrick C
Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, CA, USA.
Division of Plastic and Reconstructive Surgery, St. Joseph's Regional Medical Center, Paterson, NJ, USA.
Cleft Palate Craniofac J. 2024 Aug 1:10556656241266368. doi: 10.1177/10556656241266368.
The application of artificial intelligence (AI) in healthcare has expanded in recent years, and tools such as ChatGPT for generating patient-facing information have garnered particular interest. Online cleft lip and palate (CL/P) surgical information supplied by academic/professional (A/P) sources was therefore evaluated against ChatGPT with respect to accuracy, comprehensiveness, and clarity.
Eleven plastic and reconstructive surgeons and 29 non-medical individuals blindly compared responses to 30 frequently asked CL/P surgery questions written by ChatGPT or drawn from A/P sources. Surgeons indicated preference, assessed accuracy, and scored comprehensiveness and clarity; non-medical individuals indicated preference only. Readability scores were calculated using seven readability formulas. Statistical comparisons of the CL/P surgical online information were performed using paired t-tests.
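The abstract does not name the seven readability formulas used, but metrics such as the Flesch-Kincaid Grade Level are standard for this kind of patient-material assessment. As an illustration only, the sketch below computes that one formula with a crude vowel-group syllable counter; published work typically relies on validated tools rather than this simplified estimate.

```python
import re

def count_syllables(word):
    # Crude estimate: count vowel groups; drop a trailing silent "e".
    # Validated readability tools use dictionary-based syllabification.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_kincaid_grade(text):
    # FK grade = 0.39*(words/sentences) + 11.8*(syllables/word) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

# Hypothetical patient-information sentence for illustration.
sample = "Cleft lip repair is usually performed at three months of age."
print(round(flesch_kincaid_grade(sample), 1))
```

A score above the commonly recommended sixth-grade level for patient materials would flag the text as too difficult, which is the kind of threshold comparison the study's readability analysis reports.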
In blinded comparisons, surgeons preferred material generated by ChatGPT over A/P sources 60.88% of the time. Surgeons also consistently rated ChatGPT-generated material as more comprehensive and clearer. No significant difference in accuracy was found between ChatGPT and resources provided by professional organizations. Among individuals with no medical background, ChatGPT-generated materials were preferred 60.46% of the time. Across all seven readability formulas, materials from both ChatGPT and A/P sources exceeded the reading levels advised for patient materials.
As ChatGPT-based language tools become more prominent in healthcare, their potential applications should be assessed by experts against existing high-quality sources. Our results indicate that ChatGPT is capable of producing accurate, comprehensive, and clear material that is preferred by both plastic surgeons and individuals with no medical background.