Clinicians spend significant time reviewing medical images and transcribing findings. By integrating visual and textual data, foundation models have the potential to reduce workloads and boost efficiency, yet their practical clinical value remains uncertain. In this study, we find that OpenAI's ChatGPT-4o and two medical vision-language models (VLMs) significantly underperform ophthalmologists in key tasks for age-related macular degeneration (AMD). To address this, we developed a dedicated training curriculum, designed by domain specialists, to optimize VLMs for tasks related to clinical decision making. The resulting model, RetinaVLM-Specialist, significantly outperforms foundation medical VLMs and ChatGPT-4o in AMD disease staging (F1: 0.63 vs. 0.33) and referral (0.67 vs. 0.50), achieving performance comparable to junior ophthalmologists. In a reader study, two senior ophthalmologists confirmed that RetinaVLM's reports were substantially more accurate than those written by ChatGPT-4o (64.3% vs. 14.3%). Overall, our curriculum-based approach offers a blueprint for adapting foundation models to real-world medical applications.

Specialized curricula for training vision language models in retinal image analysis.
