Large language model-based multimodal system for detecting and grading ocular surface diseases from smartphone images.

Authors

Li Zhongwen, Wang Zhouqian, Xiu Liheng, Zhang Pengyao, Wang Wenfang, Wang Yangyang, Chen Gang, Yang Weihua, Chen Wei

Affiliations

Ningbo Key Laboratory of Medical Research on Blinding Eye Diseases, Ningbo Eye Institute, Ningbo Eye Hospital, Wenzhou Medical University, Ningbo, China.

National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, China.

Publication

Front Cell Dev Biol. 2025 May 23;13:1600202. doi: 10.3389/fcell.2025.1600202. eCollection 2025.

Abstract

BACKGROUND

The development of medical artificial intelligence (AI) models is primarily driven by the need to address healthcare resource scarcity, particularly in underserved regions. Building an affordable, accessible, interpretable, and automated AI system for non-clinical settings is crucial to expanding access to quality healthcare.

METHODS

This cross-sectional study developed the Multimodal Ocular Surface Assessment and Interpretation Copilot (MOSAIC) using three multimodal large language models (gpt-4-turbo, claude-3-opus, and gemini-1.5-pro-latest) to detect three ocular surface diseases (OSDs) and to grade keratitis and pterygium. A total of 375 smartphone-captured ocular surface images from 290 eyes were used to validate MOSAIC. Performance was evaluated in both zero-shot and few-shot settings across four tasks: image quality control, OSD detection, keratitis severity assessment, and pterygium grading. The interpretability of the system was also evaluated.
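As a concrete illustration of the few-shot setup described above, the sketch below assembles a five-shot multimodal prompt for OSD detection with the OpenAI Python SDK and gpt-4-turbo (calls to claude-3-opus and gemini-1.5-pro-latest would be analogous through their own SDKs). The prompt wording, label set, file paths, and helper names (`image_part`, `detect_osd`, `FEW_SHOT`) are illustrative assumptions; the abstract does not disclose the authors' actual prompts or pipeline.

```python
# Minimal sketch of a few-shot multimodal classification prompt.
# Prompt text, labels, and paths are placeholders, not the study protocol.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def image_part(path: str) -> dict:
    """Encode a local smartphone image as a data-URL content part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {"type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"}}


# Hypothetical five-shot examples: (image path, expert label) pairs.
FEW_SHOT = [
    ("shots/keratitis_example.jpg", "keratitis"),
    ("shots/pterygium_example.jpg", "pterygium"),
    # ... up to five labelled examples in total
]


def detect_osd(query_image: str) -> str:
    """Classify one query image given the labelled in-context examples."""
    content = [{"type": "text",
                "text": "You are shown labelled ocular surface photographs, "
                        "then one unlabelled photograph. Reply with a single "
                        "disease label, or 'normal' if no OSD is present."}]
    for path, label in FEW_SHOT:
        content.append(image_part(path))
        content.append({"type": "text", "text": f"Label: {label}"})
    content.append(image_part(query_image))
    content.append({"type": "text", "text": "Label:"})

    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": content}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()
```

A zero-shot run would simply omit the FEW_SHOT pairs, which is the comparison the study uses to quantify the benefit of additional learning shots.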

RESULTS

Under the five-shot setting, MOSAIC achieved 95.00% accuracy in image quality control, 86.96% in OSD detection, 88.33% in distinguishing mild from severe keratitis, and 66.67% in pterygium grading. Performance improved significantly as the number of learning shots increased (p < 0.01). The system attained high ROUGE-L F1 scores of 0.70-0.78, demonstrating its interpretable image comprehension capability.
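The ROUGE-L F1 scores quoted above measure longest-common-subsequence overlap between a model-generated image description and a reference description. The pure-Python sketch below shows how that figure is computed; whitespace tokenization and the toy sentences are assumptions for illustration, since the abstract does not describe the authors' exact evaluation text or tooling.

```python
# ROUGE-L F1 from the longest common subsequence (LCS), as used to compare
# a generated description against a reference. Tokenization here is simple
# whitespace splitting; the study's exact setup is not given in the abstract.

def lcs_length(a: list[str], b: list[str]) -> int:
    """Dynamic-programming LCS length between two token sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """F1 of LCS-based precision and recall (ROUGE-L)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical descriptions, not from the study):
print(rouge_l_f1(
    "corneal opacity with conjunctival injection suggesting keratitis",
    "corneal opacity and conjunctival injection consistent with keratitis",
))
```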

CONCLUSION

MOSAIC exhibited exceptional few-shot learning capabilities, achieving high accuracy in OSD management with minimal training examples. This system has significant potential for smartphone integration to enhance the accessibility and effectiveness of OSD detection and grading in resource-limited settings.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cd9/12141289/92f1687b7a22/fcell-13-1600202-g001.jpg
