Assessing ChatGPT's Capability in Addressing Thyroid Cancer Patient Queries: A Comprehensive Mixed-Methods Evaluation.

Authors

Gorris Matthew A, Randle Reese W, Obermiller Corey S, Thomas Johnson, Toro-Tobon David, Dream Sophie Y, Fackelmayer Oliver J, Pandian T K, Mayson Sarah E

Affiliations

Division of Endocrinology and Metabolism, Wake Forest University School of Medicine, Winston Salem, NC 27101, USA.

Department of Surgery, Section of Surgical Oncology, Wake Forest University School of Medicine, Winston Salem, NC 27101, USA.

Publication information

J Endocr Soc. 2025 Jan 13;9(2):bvaf003. doi: 10.1210/jendso/bvaf003. eCollection 2025 Jan 6.

DOI:10.1210/jendso/bvaf003

PMID:39881674

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11775116/

Abstract

CONTEXT

Literature suggests patients with thyroid cancer have unmet informational needs in many aspects of care. Patients often turn to online resources for their health-related information, and generative artificial intelligence programs such as ChatGPT are an emerging and attractive resource for patients.

OBJECTIVE

To assess the quality of ChatGPT's responses to thyroid cancer-related questions.

METHODS

Four endocrinologists and 4 endocrine surgeons, all with expertise in thyroid cancer, evaluated ChatGPT's responses to 20 thyroid cancer-related questions. Responses were scored on a 7-point Likert scale for accuracy, completeness, and overall satisfaction. Comments from the evaluators were aggregated and analyzed qualitatively.
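As an illustration of how an "agreed or strongly agreed" percentage can be derived from Likert ratings, the sketch below counts the share of ratings at or above a cutoff. It is not the authors' analysis code. The cutoff of 6 (treating 6 as "agree" and 7 as "strongly agree"), the function name `agreement_rate`, and the sample ratings are all assumptions made for illustration. If every evaluator scored every response, each domain would have 8 × 20 = 160 ratings, but the abstract does not state this explicitly.

```python
# Assumption: on the 7-point scale, 6 = "agree" and 7 = "strongly agree".
AGREE_THRESHOLD = 6


def agreement_rate(ratings, threshold=AGREE_THRESHOLD):
    """Return the fraction of Likert ratings at or above the threshold."""
    if not ratings:
        raise ValueError("no ratings supplied")
    for r in ratings:
        if not 1 <= r <= 7:
            raise ValueError(f"rating {r} is outside the 1-7 scale")
    return sum(r >= threshold for r in ratings) / len(ratings)


# Made-up accuracy ratings from 8 hypothetical evaluators for one response.
example_ratings = [7, 6, 5, 6, 3, 7, 4, 6]
print(f"{agreement_rate(example_ratings):.1%} rated agree or strongly agree")
```

Pooling the ratings from all evaluators and questions for a domain and passing them to the same function would give one overall percentage per domain.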

RESULTS

Overall, in only 57%, 56%, and 52% of ratings did evaluators "agree" or "strongly agree" that ChatGPT's answers were accurate, complete, and satisfactory, respectively. One hundred ninety-eight free-text comments were included in the qualitative analysis. Most comments were critical. Several themes emerged, including overemphasis on diet and iodine intake and their role in thyroid cancer, and incomplete or inaccurate information on the risks of both thyroid surgery and radioactive iodine therapy.

CONCLUSION

Our study suggests that ChatGPT is not accurate or reliable enough at this time for unsupervised use as a patient information tool for thyroid cancer.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2081/11775116/57ad5dd1631d/bvaf003f1.jpg
