Kohli Marc D, Summers Ronald M, Geis J Raymond
Radiology and Biomedical Imaging, 505 Parnassus Ave, Moffit-391, San Francisco, CA, 94117, USA.
Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda, MD, 20892-1182, USA.
J Digit Imaging. 2017 Aug;30(4):392-399. doi: 10.1007/s10278-017-9976-3.
At the first annual Conference on Machine Intelligence in Medical Imaging (C-MIMI), held in September 2016, a conference session on medical image data and datasets for machine learning identified multiple issues. The common theme from attendees was that everyone applying machine learning to medical image evaluation is data-starved. There is an urgent need to find better ways to collect, annotate, and reuse medical imaging data. Unique domain issues with medical image datasets require further study, the development and dissemination of best practices and standards, and a coordinated effort among medical imaging domain experts, medical imaging informaticists, government and industry data scientists, and interested commercial, academic, and government entities. The high-level attributes of reusable medical image datasets suitable to train, test, validate, verify, and regulate ML products should be better described. NIH and other government agencies should promote and, where applicable, enforce access to medical image datasets. We should improve communication among medical imaging domain experts, medical imaging informaticists, academic clinical and basic science researchers, government and industry data scientists, and interested commercial entities.