Ruijsink Bram, Puyol-Antón Esther, Li Ye, Bai Wenjia, Kerfoot Eric, Razavi Reza, King Andrew P
School of Biomedical Engineering & Imaging Sciences, King's College London, UK.
St Thomas' Hospital NHS Foundation Trust, London, UK.
Stat Atlases Comput Models Heart. 2020;2020:97-107. doi: 10.1007/978-3-030-68107-4_10. Epub 2021 Jan 29.
One of the challenges in developing deep learning algorithms for medical image segmentation is the scarcity of annotated training data. To overcome this limitation, data augmentation and semi-supervised learning (SSL) methods have been developed. However, these methods have limited effectiveness, as they either exploit only the existing data set (data augmentation) or risk a negative impact by adding poor training examples (SSL). Segmentations are rarely the final product of medical image analysis: they are typically used in downstream tasks to infer higher-order patterns for evaluating disease. Clinicians take into account a wealth of prior knowledge on biophysics and physiology when evaluating image analysis results. We have used these clinical assessments in previous works to create robust quality-control (QC) classifiers for automated cardiac magnetic resonance (CMR) analysis. In this paper, we propose a novel scheme that uses QC of the downstream task to identify high-quality outputs of CMR segmentation networks, which are subsequently utilised for further network training. In essence, this provides quality-aware augmentation of training data in a variant of SSL for segmentation networks (semiQCSeg). We evaluate our approach in two CMR segmentation tasks (aortic and short-axis cardiac volume segmentation) using UK Biobank data and two commonly used network architectures (U-net and a Fully Convolutional Network), and compare it against supervised and SSL strategies. We show that semiQCSeg improves training of the segmentation networks. It decreases the need for labelled data, while outperforming the other methods in terms of Dice and clinical metrics. SemiQCSeg can be an efficient approach for training segmentation networks for medical image data when labelled datasets are scarce.
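The quality-aware selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `select_qc_passed`, `quality_aware_round`, `train` and `passes_qc` are hypothetical, and `passes_qc` stands in for a QC classifier applied to the downstream task. The abstract does not specify how many rounds are run or how the QC decision is made, so a single round with a boolean QC outcome is assumed.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Image = TypeVar("Image")
Mask = TypeVar("Mask")
Model = Callable[[Image], Mask]


def select_qc_passed(
    unlabelled: Sequence[Image],
    segment: Model,
    passes_qc: Callable[[Image, Mask], bool],
) -> List[Tuple[Image, Mask]]:
    """Segment unlabelled images and keep only outputs that pass QC."""
    accepted = []
    for image in unlabelled:
        mask = segment(image)
        # Hypothetical QC hook: in the paper's setting this decision comes
        # from QC of the downstream task, not from the mask alone.
        if passes_qc(image, mask):
            accepted.append((image, mask))
    return accepted


def quality_aware_round(
    labelled: Sequence[Tuple[Image, Mask]],
    unlabelled: Sequence[Image],
    train: Callable[[Sequence[Tuple[Image, Mask]]], Model],
    passes_qc: Callable[[Image, Mask], bool],
) -> Tuple[Model, List[Tuple[Image, Mask]]]:
    """One assumed round: train, pseudo-label, filter by QC, retrain."""
    initial_model = train(labelled)
    pseudo_labelled = select_qc_passed(unlabelled, initial_model, passes_qc)
    # Only QC-approved outputs are added to the training set.
    retrained = train(list(labelled) + pseudo_labelled)
    return retrained, pseudo_labelled
```

The point of the design is that rejected outputs never reach the training set, which is how the scheme avoids the negative impact of poor pseudo-labels that the abstract attributes to plain SSL.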