Center for Medical Image Science and Visualization (CMIV), Linköping University Hospital, Linköping University, SE-581 85, Linköping, Sweden.
Department of Health, Medicine and Caring Sciences (HMV), Linköping University, SE-581 85, Linköping, Sweden.
J Digit Imaging. 2021 Feb;34(1):105-115. doi: 10.1007/s10278-020-00384-4.
Artificial intelligence (AI) holds much promise for improving imaging diagnostics. One of the most limiting bottlenecks in developing useful clinical-grade AI models is the lack of training data. One aspect is the large number of cases needed, and another is the need for high-quality ground truth annotation. The aim of the project was to construct, and to describe the construction of, a database with substantial amounts of detail-annotated oncology imaging data from pathology and radiology. A specific objective was to be proactive, that is, to support as yet undefined AI training across a wide range of tasks, such as detection, quantification, segmentation, and classification, which places particular emphasis on the quality and generality of the annotations. The main outcome of the project was the database itself, a collection of labeled image data from breast, ovary, skin, colon, skeleton, and liver. The effort also served as an exploration of best practices for scaling up high-quality image collections. A main contribution of the study is a set of generic lessons learned on how to successfully organize efforts to construct medical imaging databases for AI training, summarized as eight guiding principles covering team, process, and execution aspects.