Güneş Ayetullah Mehdi, van Rooij Ward, Gulshad Sadaf, Slotman Ben, Dahele Max, Verbakel Wilko
Department of Radiation Oncology, Amsterdam UMC, Amsterdam, The Netherlands.
Faculty of Science, Universiteit van Amsterdam, Amsterdam, The Netherlands.
Med Phys. 2023 Oct;50(10):6421-6432. doi: 10.1002/mp.16437. Epub 2023 Apr 29.
Clinical data used to train deep learning models are often imperfect: both the imaging data and the corresponding segmentations can contain flaws.
This study investigates the influence of data imperfections on the performance of deep learning models for parotid gland segmentation. This was done in a controlled manner by using synthesized data. The insights this study provides may be used to make deep learning models better and more reliable.
The data were synthesized from the clinical segmentations, which created a pseudo ground-truth in the process. Three kinds of imperfections were simulated: incorrect segmentations, low image contrast, and artifacts in the imaging data. The severity of each imperfection was varied across five levels. For each imperfection, a model was trained on the training set of each severity level and cross-evaluated on the test sets of all five levels.
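The cross-evaluation design described above (one model per training severity level, each tested on every test severity level) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration and is not taken from the study: the square toy mask, the `synthesize` function, the contrast values, the use of the Dice coefficient as the metric, and above all `fit_threshold`, a simple intensity threshold that stands in for a deep learning model.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom

def synthesize(mask, contrast, rng, noise_sd=0.05):
    """Hypothetical synthesis: foreground at `contrast`, background at 0, plus Gaussian noise."""
    return contrast * mask.astype(float) + rng.normal(0.0, noise_sd, mask.shape)

def fit_threshold(images, masks):
    """Stand-in for model training (NOT a deep network): midpoint of mean foreground and background."""
    fg = np.mean([img[m].mean() for img, m in zip(images, masks)])
    bg = np.mean([img[~m].mean() for img, m in zip(images, masks)])
    return 0.5 * (fg + bg)

rng = np.random.default_rng(0)

# Toy pseudo ground-truth: a square "organ" in a 64x64 image.
mask = np.zeros((64, 64), dtype=bool)
mask[20:44, 20:44] = True

# Five assumed severity levels for the low-contrast imperfection (best to worst).
contrast_levels = [1.0, 0.8, 0.6, 0.4, 0.2]

# scores[i, j]: model trained at level i, evaluated at level j.
scores = np.zeros((len(contrast_levels), len(contrast_levels)))
for i, c_train in enumerate(contrast_levels):
    threshold = fit_threshold([synthesize(mask, c_train, rng)], [mask])
    for j, c_test in enumerate(contrast_levels):
        test_image = synthesize(mask, c_test, rng)
        scores[i, j] = dice(test_image > threshold, mask)

print(np.round(scores, 2))
```

Each row of `scores` corresponds to one training level and each column to one test level, so the diagonal holds the matched train/test conditions. The toy threshold only shows how such a grid is filled in; it is not expected to reproduce the behaviour of the models in the study.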
Using synthesized data led to almost perfect parotid gland segmentation when no error was added. Lowering the quality of the parotid gland segmentations used for training substantially lowered model performance. In contrast, lowering the image quality of the training data, by decreasing the contrast or introducing artifacts, made the resulting models more robust to data containing the same kind of imperfection.
This study demonstrates the importance of good-quality segmentations for deep learning training and shows that training on low-quality imaging data can enhance the robustness of the resulting models.