Exploring the Potential of ChatGPT-4o in Thyroid Nodule Diagnosis Using Multi-Modality Ultrasound Imaging: Dual- vs. Triple-Modality Approaches.

Authors

Chen Ziman, Chambara Nonhlanhla, Liu Shirley Yuk Wah, Chow Tom Chi Man, Lai Carol Man Sze, Ying Michael Tin Cheung

Affiliations

Department of Health Technology and Informatics, The Hong Kong Polytechnic University, 11 Yuk Choi Rd., Hung Hom, Kowloon, Hong Kong.

School of Healthcare Sciences, Cardiff University, Cardiff CF14 4XN, UK.

Publication information

Cancers (Basel). 2025 Jun 20;17(13):2068. doi: 10.3390/cancers17132068.

DOI:10.3390/cancers17132068

PMID:40647374

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12249084/

Abstract

Recent advancements in large language models, such as ChatGPT-4o, have created new opportunities for analyzing complex multi-modal data, including medical images. This study aims to assess the potential of ChatGPT-4o in distinguishing between benign and malignant thyroid nodules via multi-modality ultrasound imaging: grayscale ultrasound, color Doppler ultrasound (CDUS), and shear wave elastography (SWE). Patients who underwent thyroid nodule ultrasound examinations and had confirmed pathological diagnoses were included. ChatGPT-4o analyzed the multi-modality ultrasound data using two approaches: (1) a dual-modality strategy that employed grayscale ultrasound and CDUS, and (2) a triple-modality strategy that incorporated grayscale ultrasound, CDUS, and SWE. Diagnostic performance was compared against pathological findings using receiver operating characteristic (ROC) curve analysis, while consistency was evaluated through Cohen's kappa (κ) analysis. A total of 106 thyroid nodules were evaluated; 65.1% were benign and 34.9% malignant. In the dual-modality approach, ChatGPT-4o achieved an area under the ROC curve (AUC) of 66.3%, moderate agreement with pathology results (κ = 0.298), a sensitivity of 70.3%, a specificity of 62.3%, and an accuracy of 65.1%. Conversely, the triple-modality approach exhibited higher specificity at 97.1% but lower sensitivity at 18.9%, with an accuracy of 69.8% and reduced overall agreement (κ = 0.194), resulting in an AUC of 58.0%. ChatGPT-4o exhibits potential, to some extent, in classifying thyroid nodules using multi-modality ultrasound imaging. However, the dual-modality approach unexpectedly outperforms the triple-modality approach. This indicates that ChatGPT-4o might encounter challenges in integrating and prioritizing different data modalities, particularly when conflicting information is present, which could impact diagnostic effectiveness.
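The reported figures can be cross-checked with a short calculation. The sketch below is illustrative only: the abstract gives percentages, not counts, so the 2×2 confusion-matrix counts are back-calculated assumptions (37 malignant and 69 benign nodules out of 106, with true-positive and true-negative counts chosen to match the stated sensitivity and specificity). The `Confusion` class and `metrics` function are hypothetical names, and the AUC is computed as the area under a single-operating-point ROC curve, (sensitivity + specificity) / 2, which is an assumption about how a binary benign/malignant output was scored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 confusion matrix with malignant as the positive class."""
    tp: int  # malignant nodules called malignant
    fn: int  # malignant nodules called benign
    tn: int  # benign nodules called benign
    fp: int  # benign nodules called malignant


def metrics(c: Confusion) -> dict[str, float]:
    """Sensitivity, specificity, accuracy, Cohen's kappa, single-point AUC."""
    n = c.tp + c.fn + c.tn + c.fp
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / n
    # Chance agreement from the row and column marginals.
    p_expected = (
        (c.tp + c.fn) * (c.tp + c.fp) + (c.tn + c.fp) * (c.tn + c.fn)
    ) / n**2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    # ROC through (0,0), (1 - specificity, sensitivity), (1,1).
    auc = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
        "auc": auc,
    }


# ASSUMED counts, back-calculated from the percentages in the abstract.
dual = Confusion(tp=26, fn=11, tn=43, fp=26)
triple = Confusion(tp=7, fn=30, tn=67, fp=2)

for name, cm in (("dual", dual), ("triple", triple)):
    m = metrics(cm)
    print(
        f"{name}: sens={m['sensitivity']:.1%} spec={m['specificity']:.1%} "
        f"acc={m['accuracy']:.1%} kappa={m['kappa']:.3f} auc={m['auc']:.1%}"
    )
```

Under these assumed counts, the computed values agree with the abstract for both strategies, which suggests the triple-modality accuracy gain comes almost entirely from calling nearly every nodule benign (only 9 of 106 flagged as malignant).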


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ded/12249084/583cb0dbe0fa/cancers-17-02068-g002.jpg
