
AI-driven patient support: Evaluating the effectiveness of ChatGPT-4 in addressing queries about ovarian cancer compared with healthcare professionals in gynecologic oncology.

Author Information

Chou Hung-Hsueh, Chen Yi Hua, Lin Chiu-Tzu, Chang Hsien-Tsung, Wu An-Chieh, Tsai Jia-Ling, Chen Hsiao-Wei, Hsu Ching-Chun, Liu Shu-Ya, Lee Jian Tao

Affiliations

Department of Obstetrics and Gynecology, Linkou Branch, Chang Gung Memorial Hospital, Tao-Yuan, Taiwan.

School of Medicine, National Tsing Hua University, Hsinchu, Taiwan.

Publication Information

Support Care Cancer. 2025 Apr 1;33(4):337. doi: 10.1007/s00520-025-09389-7.

Abstract

PURPOSE

Artificial intelligence (AI) chatbots such as ChatGPT-4 allow users to ask questions interactively. This study evaluated the correctness and completeness of responses to questions about ovarian cancer from LilyBot, a GPT-4-based chatbot, compared with responses from healthcare professionals in gynecologic cancer care.

METHODS

Fifteen categories of questions about ovarian cancer were collected from an online patient chat-group forum. Ten healthcare professionals in gynecologic oncology generated 150 questions and responses related to these topics. Responses from LilyBot and the healthcare professionals were scored for correctness and completeness by eight independent healthcare professionals with similar backgrounds, blinded to the identity of the responders. Differences between groups were analyzed with Mann-Whitney U and Kruskal-Wallis tests, followed by Tukey's post hoc comparisons.
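The group comparison above rests on the Mann-Whitney U test. As a rough illustration of what that statistic computes over two sets of ordinal ratings, here is a plain-Python sketch; the scores below are invented for illustration and are not the study's data:

```python
def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic for two independent samples,
    using midranks to handle tied values."""
    combined = sorted(group_a + group_b)
    # Map each distinct value to its midrank: tied values share
    # the average of the 1-indexed rank positions they occupy.
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_a = sum(ranks[x] for x in group_a)
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return min(u_a, n_a * n_b - u_a)

# Invented correctness ratings (1-6 scale) for the two responder groups.
print(mann_whitney_u([6, 5, 6, 4, 5], [5, 4, 5, 5, 4]))  # → 7.0
```

A U value small relative to `n_a * n_b` indicates that the two groups' rank distributions differ; in practice a library routine such as `scipy.stats.mannwhitneyu` would also supply the p-value, as reported in the abstract.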

RESULTS

Mean scores for overall performance for all 150 questions were significantly higher for LilyBot compared with the healthcare professionals for correctness (5.31 ± 0.98 vs. 5.07 ± 1.00, p = 0.017; range = 1-6) and completeness (2.66 ± 0.55 vs. 2.36 ± 0.55, p < 0.001; range = 1-3). LilyBot had significantly higher scores for immunotherapy compared with the healthcare professionals for correctness (6.00 ± 0.00 vs. 4.70 ± 0.48, p = 0.020) and completeness (3.00 ± 0.00 vs. 2.00 ± 0.00, p < 0.010); and gene therapy for completeness (3.00 ± 0.00 vs. 2.20 ± 0.42, p = 0.023).

CONCLUSIONS

The significantly better performance by LilyBot compared with healthcare professionals highlights the potential of ChatGPT-4-based dialogue systems to provide patients with clinical information about ovarian cancer.

