Haque Md Inzamam Ul, Dubey Abhishek K, Danciu Ioana, Justice Amy C, Ovchinnikova Olga S, Hinkle Jacob D
University of Tennessee, The Bredesen Center, Knoxville, Tennessee, United States.
Oak Ridge National Laboratory, Computational Sciences and Engineering Division, Oak Ridge, Tennessee, United States.
J Med Imaging (Bellingham). 2023 Jul;10(4):044503. doi: 10.1117/1.JMI.10.4.044503. Epub 2023 Aug 4.
Deep learning (DL) models have received much attention lately for their ability to achieve expert-level accuracy in the automated analysis of chest X-rays (CXRs). Recently released public CXR datasets include high-resolution images, but state-of-the-art models are trained on reduced-size images because of limits on graphics processing unit memory and training time. As computing hardware continues to advance, it has become feasible to train deep convolutional neural networks on high-resolution images without sacrificing detail through downscaling. This study examines the effect of increased input resolution on CXR classification performance.
We used the publicly available MIMIC-CXR-JPG dataset, comprising 377,110 high-resolution CXR images, for this study. We downscaled the images from native resolution to four progressively smaller input sizes and then used the DenseNet121 and EfficientNet-B4 DL models to evaluate clinical task performance at each of the four downscaled resolutions.
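As a concrete illustration of this setup, the sketch below shows one way to fine-tune DenseNet121 for multi-label CXR classification at a single downscaled resolution in PyTorch. The 512-pixel input size, the 14-finding CheXpert-style label head, and the optimizer settings are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: fine-tuning DenseNet121 for multi-label CXR
# classification at one downscaled input resolution. The resolution,
# label count, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models, transforms

NUM_FINDINGS = 14   # assumption: CheXpert-style finding labels
RESOLUTION = 512    # assumption: one of the downscaled input sizes

# Downscale from native resolution; CXRs are grayscale, so replicate
# the single channel to match the 3-channel ImageNet stem.
preprocess = transforms.Compose([
    transforms.Resize((RESOLUTION, RESOLUTION)),
    transforms.Grayscale(num_output_channels=3),
    transforms.ToTensor(),
])

model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
model.classifier = nn.Linear(model.classifier.in_features, NUM_FINDINGS)

criterion = nn.BCEWithLogitsLoss()  # multi-label: one sigmoid per finding
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimization step on a batch of preprocessed images."""
    optimizer.zero_grad()
    logits = model(images)
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The same loop would be repeated per resolution, yielding one trained base learner for each input size.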
We find that while some clinical findings are labeled more reliably at high resolutions, many others are actually labeled better from downscaled inputs. By inspecting the effective receptive fields and class activation maps of trained models, we qualitatively verify that tasks requiring a large receptive field are better suited to downscaled, low-resolution input images. Finally, we show that a stacked ensemble across resolutions outperforms each individual learner at every input resolution while providing interpretable scale weights, indicating that diverse information is extracted across resolutions.
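The cross-resolution stacking idea can be sketched as a small second-stage learner that blends per-resolution predicted probabilities using learnable, per-finding weights; a softmax over the weights keeps them interpretable as a distribution over resolutions. This is a minimal illustration of the general technique under those assumptions, not the paper's exact ensemble.

```python
# Minimal sketch of stacking across resolutions: per-finding softmax
# weights over base learners trained at different input sizes.
# Illustrative only; the paper's stacking model may differ.
import torch
import torch.nn as nn

class ResolutionStacker(nn.Module):
    def __init__(self, num_resolutions: int, num_findings: int):
        super().__init__()
        # One weight per (resolution, finding); softmax over resolutions
        # yields interpretable scale weights that sum to 1 per finding.
        self.scale_logits = nn.Parameter(
            torch.zeros(num_resolutions, num_findings))

    def forward(self, probs: torch.Tensor) -> torch.Tensor:
        # probs: (batch, num_resolutions, num_findings) base outputs
        weights = torch.softmax(self.scale_logits, dim=0)   # (R, F)
        return (probs * weights.unsqueeze(0)).sum(dim=1)    # (batch, F)

# Example: combine 4 resolution-specific models over 14 findings.
stacker = ResolutionStacker(num_resolutions=4, num_findings=14)
base_probs = torch.rand(8, 4, 14)           # stand-in for model outputs
ensembled = stacker(base_probs)             # (8, 14) blended predictions
print(stacker.scale_logits.softmax(dim=0))  # inspect learned scale weights
```

After training the stacker on held-out predictions, inspecting the softmaxed weights shows which resolution each finding relies on most, which is the sense in which the scale weights are interpretable.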
This study suggests that information extraction from high-resolution CXRs should emphasize multi-scale features rather than focus solely on the finest image resolution.