Hopson Jessica B, Flaus Anthime, McGinnity Colm J, Neji Radhouene, Reader Andrew J, Hammers Alexander
Department of Biomedical Engineering, King's College London.
King's College London & Guy's and St Thomas' PET Centre, King's College London.
IEEE Trans Radiat Plasma Med Sci. 2024 Nov;8(8):893-901. doi: 10.1109/TRPMS.2024.3436697.
Pretraining deep convolutional network mappings using natural images helps with medical imaging analysis tasks; this is important given the limited number of clinically annotated medical images. However, many two-dimensional pretrained backbone networks are currently available. This work compared 18 different backbones from 5 architecture groups (pretrained on ImageNet) for the task of assessing [18F]FDG brain positron emission tomography (PET) image quality (reconstructed at seven simulated doses), based on three clinical image quality metrics (global quality rating, pattern recognition, and diagnostic confidence). Using two-dimensional randomly sampled patches, up to eight patients (at three dose levels each) were used for training, with three separate patient datasets used for testing. Each backbone was trained five times with the same training and validation sets, and with six cross-validation folds. Training only the final fully connected layer (with 6,000-20,000 trainable parameters) achieved a test mean absolute error of ~0.5, which was within the intrinsic uncertainty of clinical scoring. To compare the "classical" and over-parameterized regimes, the pretrained weights of the last 40% of the network layers were then unfrozen. The mean absolute error fell below 0.5 for 14 of the 18 backbones assessed, including two that had previously failed to train. Generally, backbones with residual units (e.g. DenseNets and ResNetV2s) were suited to this task, achieving the lowest mean absolute error at test time (0.45-0.5). This proof-of-concept study shows that over-parameterization may also be important for automated PET image quality assessment.
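The two training regimes described above (training only the final layer versus unfreezing the last 40% of layers) and the reported metric can be illustrated with a minimal sketch. This is not the authors' code: the helper names `unfreeze_last_fraction` and `mean_absolute_error`, the ten-layer stand-in backbone, the rounding rule for "40% of layers", and the example scores are all hypothetical. The only convention borrowed from real frameworks is a boolean `trainable` attribute on each layer, as Keras layers expose.

```python
from types import SimpleNamespace


def unfreeze_last_fraction(layers, fraction=0.4):
    """Freeze every layer, then mark the last `fraction` of them trainable.

    Returns the index of the first trainable layer. Rounding to the nearest
    whole layer is an assumption; the abstract does not say how 40% was counted.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    n_unfrozen = int(round(len(layers) * fraction))
    first_trainable = len(layers) - n_unfrozen
    for index, layer in enumerate(layers):
        layer.trainable = index >= first_trainable
    return first_trainable


def mean_absolute_error(predicted, target):
    """Average absolute difference between predicted and reference scores."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical 10-layer backbone standing in for a pretrained network.
backbone = [SimpleNamespace(name=f"layer_{i}", trainable=False) for i in range(10)]

# Regime 1 (hypothetical rendering): only the final layer is trainable.
unfreeze_last_fraction(backbone, fraction=0.0)
backbone[-1].trainable = True
head_only_flags = [layer.trainable for layer in backbone]

# Regime 2: the last 40% of layers are unfrozen.
first_trainable = unfreeze_last_fraction(backbone, fraction=0.4)
fine_tune_flags = [layer.trainable for layer in backbone]

# Made-up predicted and clinical scores, purely to exercise the metric.
example_mae = mean_absolute_error([3.0, 2.5, 4.0], [3.5, 2.0, 4.0])
```

With a real Keras model, the same helper could in principle be applied to `model.layers`, although how layers are counted (for example, whether normalization layers are included) would need to follow the original study.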