Assessing the role of advanced artificial intelligence as a tool in multidisciplinary tumor board decision-making for recurrent/metastatic head and neck cancer cases - the first study on ChatGPT 4o and a comparison to ChatGPT 4.0.

Authors

Schmidl Benedikt, Hütten Tobias, Pigorsch Steffi, Stögbauer Fabian, Hoch Cosima C, Hussain Timon, Wollenberg Barbara, Wirth Markus

Affiliations

Department of Otolaryngology Head and Neck Surgery, Technical University Munich, Munich, Germany.

Department of RadioOncology, Technical University Munich, Munich, Germany.

Publication information

Front Oncol. 2024 Sep 5;14:1455413. doi: 10.3389/fonc.2024.1455413. eCollection 2024.

DOI:10.3389/fonc.2024.1455413
PMID:39301542
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11410764/
Abstract

BACKGROUND

Recurrent and metastatic head and neck squamous cell carcinoma (HNSCC) requires complex therapeutic management that needs to be discussed in multidisciplinary tumor boards (MDT). While artificial intelligence (AI) has improved significantly in assisting healthcare professionals with informed treatment decisions for primary cases, its application in the even more complex recurrent/metastatic setting has not yet been evaluated. This study also represents the first evaluation of the recently released large language model (LLM) ChatGPT 4o, compared with ChatGPT 4.0, for providing therapy recommendations.

METHODS

Each LLM generated therapy recommendations for 100 HNSCC cases (50 cases of recurrence and 50 cases of distant metastasis), which were evaluated by two independent reviewers. The primary outcome was the quality of the therapy recommendations, measured by the following parameters: clinical recommendation, explanation, and summarization.
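
To illustrate the evaluation design described above, the following minimal sketch averages two reviewers' ratings per parameter for each model. The ratings, the 1-to-5 scale, and the names `ratings` and `mean_scores` are hypothetical; the abstract does not state the scale or the aggregation the authors used.

```python
from statistics import mean

# Hypothetical ratings: model -> parameter -> list of (reviewer_1, reviewer_2), one pair per case.
# The 1-5 scale and every value here are illustrative assumptions, not study data.
ratings = {
    "ChatGPT 4o": {
        "clinical recommendation": [(4, 5), (3, 4)],
        "explanation": [(5, 5), (4, 4)],
        "summarization": [(4, 4), (5, 3)],
    },
    "ChatGPT 4.0": {
        "clinical recommendation": [(4, 4), (4, 4)],
        "explanation": [(5, 4), (4, 5)],
        "summarization": [(3, 4), (4, 5)],
    },
}


def mean_scores(model_ratings):
    """Average both reviewers' ratings over all cases, separately for each parameter."""
    return {
        parameter: mean(score for pair in pairs for score in pair)
        for parameter, pairs in model_ratings.items()
    }


for model, model_ratings in ratings.items():
    print(model, mean_scores(model_ratings))
```

A real analysis would also need an agreement measure between the two reviewers and a significance test between the models; neither is specified in the abstract, so both are left out here.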

RESULTS

In this study, ChatGPT 4o and 4.0 provided mostly general answers for surgery, palliative care, or systemic therapy. ChatGPT 4o proved to be 48.5% faster than ChatGPT 4.0. For clinical recommendation, explanation, and summarization, both LLMs obtained high scores for their therapy recommendations, with no significant differences between them. Both nevertheless proved to be mostly assisting tools that require validation by an experienced clinician, because they lack transparency and sometimes recommend treatment modalities that are not part of the current treatment guidelines.
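
The speed comparison above can be read as a relative reduction in response time. The following is a minimal sketch of that arithmetic under that assumption. The timings are hypothetical values chosen for illustration, the function name `percent_faster` is not from the study, and the abstract does not state how the 48.5% figure was computed.

```python
def percent_faster(baseline_seconds, new_seconds):
    """Relative reduction in response time of the new model versus the baseline, in percent."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (baseline_seconds - new_seconds) / baseline_seconds * 100


# Hypothetical mean response times in seconds (assumed values, not taken from the study).
gpt4_seconds = 20.0
gpt4o_seconds = 10.3

print(f"{percent_faster(gpt4_seconds, gpt4o_seconds):.1f}% faster")
```

Under this definition a value of 50% would mean the newer model needs half the time of the baseline; other definitions, such as a ratio of throughputs, would give different numbers for the same timings.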

CONCLUSION

This research demonstrates that ChatGPT 4o and 4.0 perform similarly, while ChatGPT 4o is significantly faster. Since the current versions cannot tailor therapy recommendations, sometimes recommend incorrect treatment options, and lack information on the source material, advanced AI models can at the moment merely assist in the MDT setting for recurrent/metastatic HNSCC.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/686e/11410764/9cc44ead69b1/fonc-14-1455413-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/686e/11410764/99af687a5401/fonc-14-1455413-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/686e/11410764/8c772d1bc8da/fonc-14-1455413-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/686e/11410764/562806ea5bb4/fonc-14-1455413-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/686e/11410764/63c78ce7be49/fonc-14-1455413-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/686e/11410764/c85fb52841ea/fonc-14-1455413-g006.jpg
