Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA.
Indian Institute of Technology, Madras, Chennai, India, 600036.
J Digit Imaging. 2022 Oct;35(5):1143-1152. doi: 10.1007/s10278-022-00644-5. Epub 2022 May 13.
Image classification is arguably the most fundamental task in radiology artificial intelligence. To reduce the burden of acquiring and labeling data sets, we employed a two-pronged strategy. In Part 1, we automatically extracted labels from radiology reports. In Part 2, we used those labels to train a data-efficient reinforcement learning (RL) classifier. We applied the approach to a small set of patient images and radiology reports from our institution. For Part 1, we trained sentence-BERT (SBERT) on 90 radiology reports. For Part 2, we used the labels produced by the trained SBERT to train an RL-based classifier on a training set of [Formula: see text] images, and we tested it on a separate collection of [Formula: see text] images. For comparison, we also trained and tested a supervised deep learning (SDL) classification network on the same training and testing images using the same labels. Part 1: The trained SBERT model's accuracy improved from 82% to [Formula: see text]. Part 2: Using the labels computed in Part 1, SDL quickly overfitted the small training set. Whereas SDL showed the worst possible testing set accuracy of 50%, RL achieved [Formula: see text] testing set accuracy, with a [Formula: see text]-value of [Formula: see text]. We have demonstrated a proof-of-principle application of automated label extraction from radiology reports. Additionally, we have built on prior work applying RL to classification by using these labels and by extending from 2D slices to entire 3D image volumes. RL has again shown that it can train effectively and generalize well from small training sets.
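As a minimal sketch of why 50% can be read as the worst possible accuracy: this assumes a balanced two-class test set, which the abstract implies but does not state. Under that assumption, a classifier that has collapsed to predicting a single class scores exactly 50%. The sketch also shows one common way to compare an observed accuracy against chance, an exact one-sided binomial test. The test actually used in the study is not given here, and all numbers below are hypothetical placeholders, not the study's data.

```python
from math import comb


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def binomial_p_value(correct, total, chance=0.5):
    """Exact one-sided binomial test: P(X >= correct) under chance-level guessing."""
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical balanced binary test set (NOT the study's data):
# 10 images of class 0 and 10 images of class 1.
labels = [0] * 10 + [1] * 10

# A collapsed classifier that always predicts class 1.
constant_predictions = [1] * len(labels)
constant_accuracy = accuracy(constant_predictions, labels)

# A hypothetical classifier that labels 17 of the 20 images correctly.
p_value = binomial_p_value(17, 20)

print(f"constant-prediction accuracy: {constant_accuracy:.2f}")
print(f"p-value for 17/20 correct vs. chance: {p_value:.6f}")
```

The function names `accuracy` and `binomial_p_value` are illustrative only. With the study's own test set size and number of correct predictions, the same calculation would give a chance-level comparison, though not necessarily the statistic the authors reported.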