Department of Medicine (D.L., Y.C.), University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA; Palliative Research Center (PaRC) and Department of Medicine, Division of General Internal Medicine, Section of Palliative Care and Medical Ethics (Y.S., T.H.T.), University of Pittsburgh, Pittsburgh, Pennsylvania, USA; UPMC Hillman Cancer Center (Y.S., T.H.T.), Pittsburgh, Pennsylvania, USA.
J Pain Symptom Manage. 2024 Oct;68(4):e303-e311. doi: 10.1016/j.jpainsymman.2024.06.019. Epub 2024 Jun 26.
Artificial intelligence-driven tools such as ChatGPT are prevalent sources of online health information, yet limited research has examined how well AI-generated content aligns with professional treatment guidelines. This study compares recommendations for cancer-related symptoms generated by ChatGPT with guidelines from the National Comprehensive Cancer Network (NCCN).
We extracted treatment recommendations for nine symptoms from NCCN, drawn from four full Supportive Care sections and five subsections of the Palliative Care webpage. For the same symptoms, we entered "How can I reduce my cancer-related [symptom]?" into ChatGPT-3.5 and extracted its recommendations. A comparative content analysis focused on recommendations for medications, consultations, and non-pharmacological strategies. We compared word count and Flesch-Kincaid Grade Level (FKGL) readability for each NCCN and ChatGPT section.
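FKGL follows a fixed published formula: 0.39*(words per sentence) + 11.8*(syllables per word) - 15.59. Below is a minimal Python sketch of that computation; the regex tokenization and the vowel-group syllable heuristic are simplifying assumptions of ours, not the tooling the study used.

    import re

    def count_syllables(word: str) -> int:
        # Rough heuristic: count runs of consecutive vowels; real
        # readability tools use dictionaries or finer rules.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def fkgl(text: str) -> float:
        # Flesch-Kincaid Grade Level:
        #   0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        syllables = sum(count_syllables(w) for w in words)
        return (0.39 * len(words) / max(1, len(sentences))
                + 11.8 * syllables / len(words) - 15.59)

    print(round(fkgl("How can I reduce my cancer-related fatigue?"), 1))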
The mean percent agreement between NCCN and ChatGPT recommendations was 37.3% (range 16.7%-81.8%). NCCN offered more specific medication recommendations, and ChatGPT recommended medications for constipation and diarrhea that NCCN did not. For the NCCN Supportive Care webpages, ChatGPT responses had significantly lower word counts (P=0.03) and FKGL scores (P<0.01). For the NCCN Palliative Care webpage subsections, word count did not differ significantly (P=0.076), but ChatGPT's FKGL was significantly lower (P<0.01).
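The abstract does not state how percent agreement was operationalized; one plausible reading (an assumption here, not the paper's stated method) is the proportion of NCCN recommendation items also present in ChatGPT's output. A minimal sketch under that assumption, with hypothetical example items:

    def percent_agreement(nccn_items: set[str], chatgpt_items: set[str]) -> float:
        # Assumed definition: share of NCCN recommendations that ChatGPT
        # also made; the study may have used a different denominator.
        return 100 * len(nccn_items & chatgpt_items) / len(nccn_items)

    # Made-up constipation items, for illustration only.
    nccn = {"senna", "polyethylene glycol", "increase fluids", "increase fiber"}
    gpt = {"increase fluids", "increase fiber", "probiotics"}
    print(f"{percent_agreement(nccn, gpt):.1f}%")  # prints 50.0%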
CONCLUSIONS/LESSONS LEARNED: While ChatGPT provides concise, accessible supportive care advice, its discrepancies with guidelines raise concerns about patient-facing symptom management recommendations. Future research should consider how AI can be used alongside evidence-based guidelines to support cancer patients' supportive care needs.