Schmidl Benedikt, Hütten Tobias, Pigorsch Steffi, Stögbauer Fabian, Hoch Cosima C, Hussain Timon, Wollenberg Barbara, Wirth Markus
Department of Otolaryngology Head and Neck Surgery, Technical University Munich, Munich, Germany.
Department of RadioOncology, Technical University Munich, Munich, Germany.
Sci Rep. 2025 Jan 29;15(1):3625. doi: 10.1038/s41598-025-85920-4.
Visual diagnosis plays a key role in squamous cell carcinoma of the oral cavity (OSCC) and oropharynx (OPSCC), both subsets of head and neck squamous cell carcinoma (HNSCC) with a heterogeneous clinical appearance. Advancements in artificial intelligence have recently led to the introduction of image recognition into large language models (LLMs) such as ChatGPT 4.0. This exploratory study is the first to evaluate image recognition by ChatGPT for diagnosing squamous cell carcinoma (SCC) and leukoplakia from clinical images, with images showing no lesion serving as a control group. A total of 45 clinical images were analyzed, comprising 15 cases each of SCC, leukoplakia, and non-lesion images. In scenario one, ChatGPT 4.0 was asked to provide the most likely diagnosis based on the image alone. In scenario two, both the image and the clinical history were provided, whereas in scenario three only the clinical history was given. The results and the accuracy of the LLM were rated by two independent reviewers, and overall performance was evaluated using the modified Artificial Intelligence Performance Index (AIPI). ChatGPT 4.0 correctly identified leukoplakia cases using image recognition alone, whereas its ability to diagnose SCC from images alone was insufficient but improved when the clinical history was included in the prompt. Providing only the clinical history resulted in misclassification of most leukoplakia cases and some SCC cases. Oral cavity lesions were more likely to be diagnosed correctly. In this exploratory study of 45 images, ChatGPT 4.0 demonstrated convincing performance in detecting SCC only when the clinical history was added, whereas leukoplakia was detected by image recognition alone. ChatGPT is therefore currently insufficient for reliable OPSCC and OSCC diagnosis, but further technological advancements may pave the way for its use in the clinical setting.