Current Strengths and Weaknesses of ChatGPT as a Resource for Radiation Oncology Patients and Providers.

Affiliations

Department of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas.

Department of Radiation Oncology, Duke University School of Medicine, Durham, North Carolina; Radiation Oncology Clinical Service, Durham VA Health Care System, Durham, North Carolina.

Publication information

Int J Radiat Oncol Biol Phys. 2024 Mar 15;118(4):905-915. doi: 10.1016/j.ijrobp.2023.10.020. Epub 2023 Oct 30.

DOI:10.1016/j.ijrobp.2023.10.020

PMID:39058798

Abstract

PURPOSE

Chat Generative Pre-Trained Transformer (ChatGPT), an artificial intelligence program that uses natural language processing to generate conversational-style responses to questions or inputs, is increasingly being used by both patients and health care professionals. This study aims to evaluate the accuracy and comprehensiveness of ChatGPT in radiation oncology-related domains, including answering common patient questions, summarizing landmark clinical research studies, and providing literature reviews with specific references supporting current standard-of-care clinical practice in radiation oncology.

METHODS AND MATERIALS

We assessed the performance of ChatGPT version 3.5 (ChatGPT3.5) in 3 areas. We evaluated ChatGPT3.5's ability to answer 28 templated patient-centered questions applied across 9 cancer types. We then tested ChatGPT3.5's ability to summarize specific portions of 10 landmark studies in radiation oncology. Next, we used ChatGPT3.5 to identify scientific studies supporting current standard-of-care practice in clinical radiation oncology for 5 different cancer types. Each response was graded independently by 2 reviewers, with discordant grades resolved by a third reviewer.
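The grading workflow described above (two independent reviewers, with a third resolving disagreements) can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state the grading scale, so the grade labels, the function names `resolve_grade` and `percent_with_grade`, and the sample data are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def resolve_grade(grade_a: str, grade_b: str, tiebreak: Optional[str] = None) -> str:
    """Return the final grade for one response.

    If the two independent reviewers agree, their shared grade is final.
    If they disagree, the third reviewer's grade (tiebreak) decides.
    """
    if grade_a == grade_b:
        return grade_a
    if tiebreak is None:
        raise ValueError("discordant grades need a third reviewer's grade")
    return tiebreak


def percent_with_grade(final_grades: List[str], target: str) -> float:
    """Percentage of final grades equal to the target label."""
    if not final_grades:
        raise ValueError("no grades to summarize")
    matches = sum(1 for g in final_grades if g == target)
    return 100.0 * matches / len(final_grades)


# Hypothetical labels and made-up grades, NOT data from the study.
# Each tuple: (reviewer 1, reviewer 2, third reviewer or None).
sample = [
    ("correct", "correct", None),
    ("correct", "incomplete", "incomplete"),
    ("incorrect", "incorrect", None),
    ("incomplete", "correct", "correct"),
]
final = [resolve_grade(a, b, c) for a, b, c in sample]
share_correct = percent_with_grade(final, "correct")
```

Raising an error when a discordant pair has no third grade keeps an unresolved response from being counted silently in the summary percentage.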

RESULTS

ChatGPT3.5 frequently generated inaccurate or incomplete responses. Only 39.7% of responses to patient-centered questions were considered correct and comprehensive. When summarizing landmark studies in radiation oncology, 35.0% of ChatGPT3.5's responses were accurate and comprehensive, improving to 43.3% when provided the full text of the study. ChatGPT3.5's ability to present a list of studies related to standard-of-care clinical practices was also unsatisfactory, with 50.6% of the provided studies fabricated.

CONCLUSIONS

ChatGPT should not be considered a reliable radiation oncology resource for patients or providers at this time, as it frequently generates inaccurate or incomplete responses. However, natural language processing-based artificial intelligence programs are rapidly evolving, and future versions of ChatGPT or similar programs may demonstrate improved performance in this domain.

Similar articles

Current Strengths and Weaknesses of ChatGPT as a Resource for Radiation Oncology Patients and Providers.

Int J Radiat Oncol Biol Phys. 2024 Mar 15;118(4):905-915. doi: 10.1016/j.ijrobp.2023.10.020. Epub 2023 Oct 30.

The performance of artificial intelligence language models in board-style dental knowledge assessment: A preliminary study on ChatGPT.

J Am Dent Assoc. 2023 Nov;154(11):970-974. doi: 10.1016/j.adaj.2023.07.016. Epub 2023 Sep 5.

Assessment of ChatGPT-3.5's Knowledge in Oncology: Comparative Study with ASCO-SEP Benchmarks.

JMIR AI. 2024 Jan 12;3:e50442. doi: 10.2196/50442.

Evaluating ChatGPT to test its robustness as an interactive information database of radiation oncology and to assess its responses to common queries from radiotherapy patients: A single institution investigation.

Cancer Radiother. 2024 Jun;28(3):258-264. doi: 10.1016/j.canrad.2023.11.005. Epub 2024 Jun 12.

Is ChatGPT accurate and reliable in answering questions regarding head and neck cancer?

Front Oncol. 2023 Dec 1;13:1256459. doi: 10.3389/fonc.2023.1256459. eCollection 2023.

Comparison of Large Language Models in Answering Immuno-Oncology Questions: A Cross-Sectional Study.

Oncologist. 2024 May 3;29(5):407-414. doi: 10.1093/oncolo/oyae009.

Comparison of Large Language Models in Answering Immuno-Oncology Questions: A Cross-Sectional Study.

medRxiv. 2023 Oct 31:2023.10.31.23297825. doi: 10.1101/2023.10.31.23297825.

Use and Application of Large Language Models for Patient Questions Following Total Knee Arthroplasty.

J Arthroplasty. 2024 Sep;39(9):2289-2294. doi: 10.1016/j.arth.2024.03.017. Epub 2024 Mar 13.

ChatGPT versus NASS clinical guidelines for degenerative spondylolisthesis: a comparative analysis.

Eur Spine J. 2024 Nov;33(11):4182-4203. doi: 10.1007/s00586-024-08198-6. Epub 2024 Mar 15.

Chat-GPT on brain tumors: An examination of Artificial Intelligence/Machine Learning's ability to provide diagnoses and treatment plans for example neuro-oncology cases.

Clin Neurol Neurosurg. 2024 Apr;239:108238. doi: 10.1016/j.clineuro.2024.108238. Epub 2024 Mar 9.

Cited by

Development and evaluation of large-language models (LLMs) for oncology: A scoping review.

PLOS Digit Health. 2025 Aug 7;4(8):e0000980. doi: 10.1371/journal.pdig.0000980. eCollection 2025 Aug.

Generative AI/LLMs for Plain Language Medical Information for Patients, Caregivers and General Public: Opportunities, Risks and Ethics.

Patient Prefer Adherence. 2025 Jul 31;19:2227-2249. doi: 10.2147/PPA.S527922. eCollection 2025.

Large language model integrations in cancer decision-making: a systematic review and meta-analysis.

NPJ Digit Med. 2025 Jul 17;8(1):450. doi: 10.1038/s41746-025-01824-7.

Feasibility study of automatic radiotherapy treatment planning for cervical cancer using a large language model.

Radiat Oncol. 2025 May 15;20(1):77. doi: 10.1186/s13014-025-02660-5.

A Narrative Review on the Application of Large Language Models to Support Cancer Care and Research.

Yearb Med Inform. 2024 Aug;33(1):90-98. doi: 10.1055/s-0044-1800726. Epub 2025 Apr 8.

Current applications and challenges in large language models for patient care: a systematic review.

Commun Med (Lond). 2025 Jan 21;5(1):26. doi: 10.1038/s43856-024-00717-2.

Leveraging Large Language Models for Precision Monitoring of Chemotherapy-Induced Toxicities: A Pilot Study with Expert Comparisons and Future Directions.

Cancers (Basel). 2024 Aug 12;16(16):2830. doi: 10.3390/cancers16162830.
