Li Junyuan, Wang Wenying, Tivnan Matthew, Stayman J Webster, Gang Grace J
Department of Biomedical Engineering, Johns Hopkins University, Baltimore MD, USA 21205.
Proc SPIE Int Soc Opt Eng. 2022 Feb-Mar;12031. doi: 10.1117/12.2612732. Epub 2022 Apr 4.
The proliferation of deep learning image processing calls for a quantitative image quality assessment framework that is suitable for nonlinear, data-dependent algorithms. In this work, we propose a method to systematically evaluate the system and noise responses so that the nonlinear transfer properties can be mapped out. The method involves sampling lesion perturbations as a function of size and contrast, as well as clinically relevant features such as shape and texture that may be important for diagnosis. We embed the perturbations in backgrounds of varying attenuation level, noise magnitude, and noise correlation that are associated with different patient anatomies and imaging protocols. The range of system and noise responses is further used to evaluate performance for clinical tasks such as signal detection and classification. We performed the assessment for an example CNN-denoising algorithm for low-dose lung CT screening. The system response of the CNN-denoising algorithm exhibits highly nonlinear behavior: neither contrast nor higher-order lesion features such as spiculated boundaries are reliably represented for lesion perturbations of small size and low contrast. The noise properties are potentially highly nonstationary and should not be assumed to be the same between the signal-present and signal-absent images. Furthermore, we observe a high degree of dependence of both system and noise response on the background attenuation level. A non-negativity constraint is effectively imposed on inputs around zero; transfer properties for higher background levels are highly variable. For a detection task, CNN-denoised images improved the detectability index by 16-18% compared to low-dose CT inputs. For a classification task between spiculated and smooth lesions, CNN-denoised images resulted in a much larger improvement of up to 50%.
The performance assessment framework proposed in this work can systematically map out the nonlinear transfer functions of deep learning algorithms and can potentially enable robust deployment of such algorithms in medical imaging applications.
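The perturbation-based measurement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_denoiser` is a hypothetical stand-in for the CNN (a simple clamp at zero, chosen only to mimic a non-negativity constraint), and the names `system_response` and `contrast_recovery`, the white Gaussian noise model, and the use of paired noise realizations are all assumptions made for illustration.

```python
import numpy as np


def toy_denoiser(image):
    """Hypothetical nonlinear stand-in for the CNN: clamps negative values to zero."""
    return np.maximum(image, 0.0)


def system_response(algorithm, background, lesion, noise_sigma,
                    n_realizations=100, seed=0):
    """Estimate the mean output change caused by embedding a lesion perturbation.

    The same noise realization is added to the signal-present and signal-absent
    inputs (an assumption that reduces the variance of the estimate), and the
    output difference is averaged over realizations.
    """
    rng = np.random.default_rng(seed)
    diff = np.zeros(background.shape, dtype=float)
    for _ in range(n_realizations):
        noise = rng.normal(0.0, noise_sigma, size=background.shape)
        present = algorithm(background + lesion + noise)
        absent = algorithm(background + noise)
        diff += present - absent
    return diff / n_realizations


def contrast_recovery(response, lesion):
    """Ratio of integrated output response to integrated input perturbation."""
    return float(response.sum() / lesion.sum())


# A small negative-contrast lesion embedded in two uniform backgrounds.
lesion = np.zeros((8, 8))
lesion[3:5, 3:5] = -1.0
bg_low = np.zeros((8, 8))        # background near zero
bg_high = np.full((8, 8), 10.0)  # higher attenuation background

for name, bg in [("low", bg_low), ("high", bg_high)]:
    resp = system_response(toy_denoiser, bg, lesion, noise_sigma=0.5)
    print(name, contrast_recovery(resp, lesion))
```

Even with this toy algorithm, the recovered contrast depends on the background level, which is why the response has to be sampled over backgrounds rather than summarized by a single transfer function. Repeating the loop over lesion size, contrast, and shape would map out the response in the manner the abstract describes.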