Bheemireddy Samhita, Leslie Sarah E, Durden Jakob A, Burnet George, Aryanpour Zain, Fong Ashlyn, Higgins Madeline G, Greenseid Samantha, McLemore Lauren, Li Gande, Miles Randy, Taft Nancy, Tevis Sarah
Albany Medical College, Albany Medical Center, Albany, NY, USA.
Adult and Child Center for Outcomes Research and Delivery Science (ACCORDS), University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
Ann Surg Oncol. 2025 Jul 21. doi: 10.1245/s10434-025-17860-2.
BACKGROUND: Patients have immediate access to their diagnostic reports, but these reports exceed the recommended reading level for patient-facing materials. Generative artificial intelligence may be a tool for improving patient comprehension of health information. This study assessed the readability and accuracy of ChatGPT-simplified breast pathology reports.

METHODS: Ten de-identified patient breast pathology reports were simplified by ChatGPT-4.0 using three different prompts. Prompt 1 requested simplification, Prompt 2 added a 6th-grade-level specification, and Prompt 3 requested essential information only. The Flesch-Kincaid Reading Level (FKRL) and Flesch Reading Ease Score (FRES) were used to quantify readability and ease of reading, respectively. Five physicians used a four-point scale to assess factual correctness, relevancy, and fabrications to determine overall accuracy. Mean scores and standard deviations for FKRL, FRES, and accuracy were compared using analysis of variance (ANOVA) and t-tests.

RESULTS: Prompt 2 yielded a reduction in FKRL (p < 0.001) and an increase in FRES (p < 0.001), indicating improved readability and ease of reading. ChatGPT-simplified reports received an overall accuracy score of 3.59/4 (standard deviation [SD] ± 0.17). The scores by rubric category were 3.62 (SD ± 0.31) for factual correctness (4 = completely correct), 3.27 (SD ± 0.44) for relevancy (4 = completely relevant), and 3.89 (SD ± 0.11) for fabricated information (4 = no fabricated information).

CONCLUSIONS: When given a grade-level specification, ChatGPT simplified breast pathology reports to the reading level recommended for patient-facing materials while mostly maintaining accuracy. To minimize the risk of medically inaccurate and/or misleading information, ChatGPT-simplified reports should be reviewed before dissemination.
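For orientation, both readability indices used in the study are standard formulas computed from word, sentence, and syllable counts. A minimal Python sketch follows; the syllable counter is a naive vowel-group heuristic for illustration only (an assumption on our part — the study presumably used validated readability tooling, which it does not name):

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count vowel groups, subtract a trailing silent "e".
    # Illustrative only; validated tools use dictionary-based counts.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (FRES, FKRL) for a passage of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    w, s = len(words), len(sentences)
    # Flesch Reading Ease Score: higher = easier to read.
    fres = 206.835 - 1.015 * (w / s) - 84.6 * (syllables / w)
    # Flesch-Kincaid grade level: approximate U.S. school grade.
    fkrl = 0.39 * (w / s) + 11.8 * (syllables / w) - 15.59
    return round(fres, 1), round(fkrl, 1)
```

A 6th-grade target, as specified in Prompt 2, corresponds roughly to FKRL ≤ 6 and FRES in the 80–90 range.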