Department of Computer Science, Ulm University of Applied Science, Albert-Einstein-Allee 55, 89081, Ulm, Baden-Württemberg, Germany.
Institute of Databases and Information Systems, Ulm University, James-Franck-Ring, 89081, Ulm, Baden-Württemberg, Germany.
Sci Rep. 2023 Jun 6;13(1):9203. doi: 10.1038/s41598-023-36148-7.
In medical imaging, deep learning models can be a critical tool for shortening time-to-diagnosis and supporting specialized medical staff in clinical decision making. Successfully training deep learning models usually requires large amounts of high-quality data, which are unavailable for many medical imaging tasks. In this work, we train a deep learning model on chest X-ray data from a university hospital, comprising 1082 images. The data were reviewed, differentiated into 4 causes of pneumonia, and annotated by an expert radiologist. To successfully train a model on this small amount of complex image data, we propose a special knowledge distillation process, which we call Human Knowledge Distillation. This process enables deep learning models to utilize annotated regions in the images during training. This form of guidance by a human expert improves model convergence and performance. We evaluate the proposed process on our study data for multiple types of models, all of which show improved results. The best model of this study, called PneuKnowNet, improves overall accuracy by 2.3 percentage points over a baseline model and also yields more meaningful decision regions. Exploiting this implicit trade-off between data quality and quantity can be a promising approach for many data-scarce domains beyond medical imaging.
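The abstract does not spell out the loss used in Human Knowledge Distillation, so the following is only a minimal sketch of a generic, temperature-scaled knowledge distillation objective, not the authors' implementation. It assumes a hypothetical setup in which a teacher model has seen the radiologist-annotated regions and a student model sees only the raw image; the function names, the weighting `alpha`, and the `temperature` value are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Generic distillation loss for one sample (illustrative only).

    Combines cross-entropy on the hard label with the KL divergence
    between the softened teacher and student distributions.
    alpha and temperature are assumed values, not taken from the paper.
    """
    # Hard-label term: standard cross-entropy of the student prediction.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # Soft-label term: KL(teacher || student) at the given temperature.
    p_teacher_soft = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher_soft, p_student_soft) if t > 0)

    # The temperature**2 factor keeps the soft term's gradient scale comparable.
    return (1 - alpha) * cross_entropy + alpha * temperature ** 2 * kl


# Four logits, one per pneumonia cause (hypothetical numbers).
student = [2.0, 0.5, 0.1, -1.0]
teacher_agrees = [3.0, 0.2, 0.0, -1.5]
teacher_disagrees = [-1.0, 3.0, 0.0, 0.2]

loss_agree = distillation_loss(student, teacher_agrees, label=0)
loss_disagree = distillation_loss(student, teacher_disagrees, label=0)
```

In this sketch the student is penalized more when its softened prediction diverges from the teacher's, which is one way guidance from annotated regions could be passed on to a model that never sees the annotations directly.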