Li Ye, Chen Junyu, Brown Justin L, Treves S Ted, Cao Xinhua, Fahey Frederic H, Sgouros George, Bolch Wesley E, Frey Eric C
Johns Hopkins University, Whiting School of Engineering, Department of Electrical and Computer Engineering, Baltimore, Maryland, United States.
Johns Hopkins University, School of Medicine, Russell H. Morgan Department of Radiology and Radiological Science, Baltimore, Maryland, United States.
J Med Imaging (Bellingham). 2021 Jul;8(4):041204. doi: 10.1117/1.JMI.8.4.041204. Epub 2021 Jan 28.
We propose a deep learning-based anthropomorphic model observer (DeepAMO) for image quality evaluation of multi-orientation, multi-slice image sets with respect to a clinically realistic 3D defect detection task. The DeepAMO is developed from a hypothetical model of the decision process of a human reader performing a detection task on a 3D volume. The DeepAMO comprises three sequential stages: defect segmentation, defect confirmation (DC), and rating value inference. The input to the DeepAMO is a composite image, typical of that used to view 3D volumes in clinical practice. The output is a rating value designed to reproduce a human observer's defect detection performance. In stages 2 and 3, we propose (1) a projection-based DC block that confirms defect presence in two orthogonal 2D orientations and (2) a calibration method that "learns" the mapping from the stage-2 features to the distribution of ratings in the human observer rating data (thus modeling inter- and intraobserver variability) using a mixture density network. We implemented and evaluated the DeepAMO in the context of 99mTc-DMSA SPECT imaging. A human observer study was conducted, with two medical imaging physics graduate students serving as observers. A cross-validation experiment was conducted to test the statistical equivalence in defect detection performance between the DeepAMO and the human observer. We also compared the performance of the DeepAMO to an unoptimized implementation of a scanning linear discriminant observer (SLDO). The results show that the DeepAMO's and the human observer's performances on unseen images were statistically equivalent, with a margin of difference of 0.0426, using 288 training images. The limited implementation of the SLDO had a substantially higher AUC (0.99) than either the DeepAMO or the human observer. The results show that the DeepAMO has the potential to reproduce the absolute performance, and not just the relative ranking, of human observers on a clinically realistic defect detection task, and that building conceptual components of the human reading process into deep learning-based models can allow these models to be trained in settings where limited training images are available.
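To make the three-stage design described in the abstract more concrete, the sketch below shows one plausible way the stages could be wired together: a segmentation stage producing a defect probability map, a confirmation stage that pools two orthogonal 2D projections of that map, and a mixture density head that outputs a distribution over ratings. All module names, layer sizes, the mean-projection operator, and the two-component Gaussian mixture are illustrative assumptions for exposition, not the authors' implementation.

```python
# Illustrative sketch only: a minimal three-stage pipeline in the spirit of the
# DeepAMO (segmentation -> projection-based defect confirmation -> mixture
# density rating head). Architectural details are assumptions, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DefectSegmenter(nn.Module):
    """Stage 1: produce a per-voxel defect probability map from the 3D volume."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv3d(8, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, vol):            # vol: (B, 1, D, H, W)
        return self.net(vol)           # (B, 1, D, H, W) defect probabilities


class ProjectionConfirmer(nn.Module):
    """Stage 2: confirm the candidate defect in two orthogonal 2D projections."""
    def __init__(self, feat_dim=32):
        super().__init__()
        self.enc2d = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, feat_dim), nn.ReLU(),
        )

    def forward(self, seg):            # seg: (B, 1, D, H, W)
        view_a = seg.mean(dim=2)       # project along depth -> (B, 1, H, W)
        view_b = seg.mean(dim=4)       # project along width -> (B, 1, D, H)
        return torch.cat([self.enc2d(view_a), self.enc2d(view_b)], dim=1)


class MixtureRatingHead(nn.Module):
    """Stage 3: map confirmation features to a Gaussian mixture over ratings,
    so the model can reproduce the spread of human ratings, not just a point."""
    def __init__(self, feat_dim=64, n_components=2):
        super().__init__()
        self.fc = nn.Linear(feat_dim, 3 * n_components)  # weights, means, log-sigmas

    def forward(self, feats):
        w, mu, log_sigma = self.fc(feats).chunk(3, dim=1)
        return F.softmax(w, dim=1), mu, log_sigma.exp()


if __name__ == "__main__":
    vol = torch.randn(2, 1, 16, 64, 64)        # toy batch of 3D volumes
    seg = DefectSegmenter()(vol)
    feats = ProjectionConfirmer()(seg)
    weights, means, sigmas = MixtureRatingHead()(feats)
    # A rating can be sampled from, or summarized by, the predicted mixture.
    print(weights.shape, means.shape, sigmas.shape)
```

In such a setup, the mixture head would typically be trained by minimizing the negative log-likelihood of the recorded human ratings under the predicted mixture, which is what allows the model to capture inter- and intraobserver variability rather than a single deterministic score.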