1Division of Surgical Oncology, Department of Surgery, UT Southwestern Medical Center, Dallas, TX.
2Department of Surgery, Yale School of Medicine, New Haven, CT.
J Natl Compr Canc Netw. 2024 May 15;22(2D):e237334. doi: 10.6004/jnccn.2023.7334.
Internet-based health education is increasingly vital in patient care. However, the readability of online information often exceeds the average reading level of the US population, limiting accessibility and comprehension. This study investigates the use of chatbot artificial intelligence to improve the readability of cancer-related patient-facing content.
We used ChatGPT 4.0 to rewrite content about breast, colon, lung, prostate, and pancreatic cancer across 34 websites associated with NCCN Member Institutions. Readability was analyzed using the Fry Readability Score, Flesch-Kincaid Grade Level, Gunning Fog Index, and Simple Measure of Gobbledygook (SMOG). The primary outcome was the mean readability score of the original and artificial intelligence (AI)-generated content. As secondary outcomes, we assessed accuracy, similarity, and quality using F1 scores, cosine similarity scores, and section 2 of the DISCERN instrument, respectively.
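The readability formulas and cosine similarity referenced above are standard, published measures. As a rough, self-contained illustration of the kind of scoring involved (not the authors' actual pipeline), the Python sketch below computes the Flesch-Kincaid Grade Level and a bag-of-words cosine similarity for a hypothetical original/rewritten pair; the syllable counter, tokenization, and example sentences are simplified assumptions.

```python
# Illustrative sketch only: Flesch-Kincaid Grade Level and bag-of-words cosine
# similarity for an original passage vs. a simplified rewrite. Real readability
# tools use dictionary-based syllable counts; this uses a crude heuristic.
import math
import re
from collections import Counter

def count_syllables(word: str) -> int:
    # Rough heuristic: count vowel groups, with a floor of one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

def cosine_similarity(a: str, b: str) -> float:
    # Cosine of the angle between term-frequency vectors of the two texts.
    va = Counter(re.findall(r"[a-z']+", a.lower()))
    vb = Counter(re.findall(r"[a-z']+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

original = "Adjuvant chemotherapy is administered following surgical resection of the tumor."
rewritten = "You may get chemotherapy after surgery to remove the tumor."
print(flesch_kincaid_grade(original), flesch_kincaid_grade(rewritten))
print(cosine_similarity(original, rewritten))
```

A lower grade-level score for the rewrite and a high cosine similarity would correspond, respectively, to the readability and content-preservation outcomes described in this study.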
The mean readability level across the 34 websites was equivalent to a university freshman level (grade 13±1.5). After ChatGPT's intervention, however, the AI-generated outputs had a mean readability score equivalent to a high school freshman level (grade 9±0.8). The overall F1 score for the rewritten content was 0.87, with a precision of 0.934 and a recall of 0.814. Compared with their original counterparts, the AI-rewritten content had a cosine similarity score of 0.915 (95% CI, 0.908-0.922). The improved readability was attributable to simpler words and shorter sentences. The mean DISCERN score of a random sample of AI-generated content was equivalent to "good" (28.5±5), with no significant difference from the original counterparts.
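For reference, F1 is the harmonic mean of precision and recall, so the reported aggregate values are internally consistent; the minimal check below is illustrative arithmetic only and does not reproduce the study's exact scoring protocol.

```python
# Sanity check: the harmonic mean of the reported precision and recall
# reproduces the reported overall F1 score.
precision, recall = 0.934, 0.814
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.87
```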
Our study demonstrates the potential of AI chatbots to improve the readability of patient-facing content while maintaining content quality. The decrease in requisite literacy after AI revision underscores the promise of this technology for reducing health care disparities caused by a mismatch between the educational resources available to a patient and that patient's health literacy.