Multiclass Classification of Pigmented Skin Lesions Using a Multimodal Large Language Model.

Author information

Iinuma Kimi, Fujii Kazuyasu, Nakashima Chisa, Kasai Kenichiro, Irie Hiroyuki, Kanetomo Hitonari, Yanagihara Shigeto, Sato Sayuri, Uhara Hisashi, Takeda Fumiaki, Otsuka Atsushi

Affiliations

Dermatology, Kindai University Hospital, Osaka, JPN.

Plastic Surgery, Kasai Clinic for Plastic Surgery, Osaka, JPN.

Publication information

Cureus. 2025 Jul 24;17(7):e88711. doi: 10.7759/cureus.88711. eCollection 2025 Jul.

DOI:10.7759/cureus.88711

PMID:40861687

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12375176/

Abstract

BACKGROUND

Pigmented skin lesions span benign to malignant entities that often appear similar on standard clinical photographs, complicating accurate diagnosis without specialized imaging. Recently, multimodal large language models (MMLLMs) have attracted attention as image-based diagnostic aids and hold promise as decision-support tools in resource-limited settings where dermoscopy may be unavailable.

OBJECTIVES

This study aimed to determine whether a fine-tuned MMLLM can accurately classify eight common pigmented skin conditions using only clinical photographs, thereby providing a non-dermoscopic diagnostic support tool.

METHODS

We fine-tuned InstructBLIP-flan-t5-xl (Salesforce, San Francisco, CA) using Hugging Face's Seq2SeqTrainer (Hugging Face Inc., New York City, NY) on a curated dataset of 979 manually cropped regions of interest, each depicting one of eight lesion types (acquired dermal melanocytosis, basal cell carcinoma, ephelis, malignant melanoma, melasma, nevus, seborrheic keratosis, or solar lentigo). Images were split 80% for training and 20% for validation. During training, lesion labels were masked to encourage learning of visual-text correlations. Model performance was evaluated by macro-average sensitivity, specificity, and F1 score, and by the area under the receiver operating characteristic curve (ROC AUC) for each class.
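
As an illustration of the 80%/20% split described above, the following is a minimal sketch rather than the authors' actual procedure. The abstract does not say whether the split was random, seeded, or stratified by class, so the seeded shuffle, the function name split_train_val, and the file names are all assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_val(items: Sequence[T], train_fraction: float = 0.8,
                    seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of `items` with a fixed seed and cut it in two.

    Assumption: a plain random split. The source abstract does not
    state how the images were assigned to each subset.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the 979 cropped regions of interest.
images = [f"roi_{i:04d}.jpg" for i in range(979)]
train_images, val_images = split_train_val(images)
print(len(train_images), len(val_images))
```

With 979 items and a 0.8 fraction this sketch puts 783 items in the training part and 196 in the validation part. Those counts come from the rounding used here, not from the paper.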

RESULTS

On the validation set, the model achieved a macro-average sensitivity of 86.0%, specificity of 98.2%, and F1 score of 0.86. ROC AUC exceeded 0.95 for six of eight classes. Malignant melanoma showed the highest performance (sensitivity 94%, ROC AUC 0.98), while nevus exhibited the lowest sensitivity (78%, ROC AUC 0.89).
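
The reported quantities can be illustrated with a small sketch of how macro-average sensitivity, specificity, and F1 score, and a one-vs-rest ROC AUC, are commonly computed. This is not the authors' evaluation code. The function names and the toy labels are hypothetical, and the sketch assumes that a numeric score per class is available for each image, which the abstract does not describe.

```python
from typing import Dict, Sequence


def macro_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                  labels: Sequence[str]) -> Dict[str, float]:
    """Macro-average one-vs-rest sensitivity, specificity, and F1."""
    pairs = list(zip(y_true, y_pred))
    sens, spec, f1 = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        tn = len(pairs) - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        sens.append(recall)
        spec.append(tn / (tn + fp) if tn + fp else 0.0)
        f1.append(2 * precision * recall / (precision + recall)
                  if precision + recall else 0.0)
    n = len(labels)
    return {"sensitivity": sum(sens) / n,
            "specificity": sum(spec) / n,
            "f1": sum(f1) / n}


def one_vs_rest_auc(y_true: Sequence[str], scores: Sequence[float],
                    positive: str) -> float:
    """ROC AUC for one class against all others.

    Uses the rank interpretation of AUC: the fraction of
    (positive, negative) pairs in which the positive example has the
    higher score, with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data with three hypothetical classes (not results from the study).
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
scores_a = [0.9, 0.4, 0.5, 0.1, 0.2, 0.6]  # assumed scores for class "a"
print(macro_metrics(y_true, y_pred, ["a", "b", "c"]))
print(one_vs_rest_auc(y_true, scores_a, "a"))
```

Macro averaging weights every class equally regardless of how many images it has, which is why a single weaker class such as nevus can lower the average noticeably.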

CONCLUSIONS

Fine-tuned MMLLMs can accurately classify common pigmented skin lesions from clinical photographs alone, enabling rapid diagnostic support in environments lacking dermoscopy. Future work should expand dataset diversity, undertake multicenter validation, and assess real-world clinical utility to confirm broader applicability.
