Flanders Adam E, Wang Xindi, Wu Carol C, Kitamura Felipe C, Shih George, Mongan John, Peng Yifan
Department of Radiology, Thomas Jefferson University, 132 S Tenth St, Ste 1080 B Main Building, Philadelphia, PA 19107.
Department of Population Health Sciences, Weill Cornell Medicine, New York, NY.
Radiol Artif Intell. 2025 Jul;7(4):e240631. doi: 10.1148/ryai.240631.
Although there are relatively few diverse, high-quality medical imaging datasets on which to train computer vision artificial intelligence models, even fewer datasets contain expertly classified observations that can be repurposed to train or test such models. The traditional annotation process is laborious and time-consuming. Repurposing annotations and consolidating similar types of annotations from disparate sources have never been practical. Until recently, the use of natural language processing to convert a clinical radiology report into labels required custom training of a language model for each use case. Newer technologies such as large language models have made it possible to generate accurate and normalized labels at scale, using only clinical reports and specific prompt engineering. Combining labels that are automatically extracted and normalized from reports with foundational image models provides a means to create labels for model training. This article provides a short history and review of the annotation and labeling process of medical images, from the traditional manual methods to the newest semiautomated methods, which offer a more scalable and efficient way to create useful models.
Keywords: Feature Detection, Diagnosis, Semi-supervised Learning
© RSNA, 2025.