Effects of data count and image scaling on Deep Learning training.

Author information

Hirahara Daisuke, Takaya Eichi, Takahara Taro, Ueda Takuya

Affiliations

Department of AI Research Lab, Harada Academy, Kagoshima, Kagoshima, Japan.

School of Science for Open and Environmental Systems, Graduate School of Science and Technology, Keio University, Yokohama, Kanagawa, Japan.

Publication information

PeerJ Comput Sci. 2020 Nov 16;6:e312. doi: 10.7717/peerj-cs.312. eCollection 2020.

DOI:10.7717/peerj-cs.312

PMID:33816963

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7924688/

Abstract

BACKGROUND

Deep learning with convolutional neural networks (CNNs) has achieved significant results in many image-based fields. Deep learning extracts features from data automatically, and a CNN extracts image features through convolution. We hypothesized that enlarging images with an interpolation method would make feature extraction more effective. To investigate how the effect of interpolation changes as the amount of training data increases, we compared data augmentation by inversion or rotation with image enlargement by interpolation when the training set was small, and we examined whether enlargement by interpolation is useful for CNN training. To assess the usefulness of interpolation for medical images, we used the Gender01 data set, a sex classification data set of chest radiographs. To compare enlargement by interpolation with data augmentation by inversion and rotation, we examined two- and four-fold enlargement using the bilinear method.
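The two preprocessing strategies compared above can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names (`bilinear_resize`, `flip_horizontal`, `rotate_90`), the half-pixel sampling convention, and the 90-degree rotation angle are all assumptions made for illustration, since the abstract does not specify the library or the exact augmentation parameters.

```python
from typing import List

Image = List[List[float]]  # 2-D grayscale image as nested lists (hypothetical type alias)


def bilinear_resize(img: Image, scale: int) -> Image:
    """Enlarge a grayscale image by an integer factor using bilinear interpolation.

    Assumption: output pixel centres are mapped back to source coordinates
    with the half-pixel convention, and coordinates are clamped at the border.
    """
    h, w = len(img), len(img[0])
    out: Image = []
    for i in range(h * scale):
        # Source row coordinate for this output row, clamped to the image.
        y = min(max((i + 0.5) / scale - 0.5, 0.0), h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        wy = y - y0
        row = []
        for j in range(w * scale):
            # Source column coordinate for this output column, clamped.
            x = min(max((j + 0.5) / scale - 0.5, 0.0), w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def flip_horizontal(img: Image) -> Image:
    """Augmentation by inversion: mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90(img: Image) -> Image:
    """Augmentation by rotation: rotate 90 degrees clockwise (angle is an assumption)."""
    return [list(col) for col in zip(*img[::-1])]


if __name__ == "__main__":
    tiny = [[0.0, 1.0], [2.0, 3.0]]
    two_fold = bilinear_resize(tiny, 2)   # 2x2 -> 4x4, same image content
    four_fold = bilinear_resize(tiny, 4)  # 2x2 -> 8x8, same image content
    augmented = [tiny, flip_horizontal(tiny), rotate_90(tiny)]  # more samples, same size
    print(len(two_fold), len(four_fold), len(augmented))
```

The sketch highlights the difference in kind between the two strategies: enlargement keeps the number of training samples fixed and increases the pixel count of each one, whereas inversion and rotation increase the number of samples and leave the image size unchanged.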

RESULTS

Enlarging the images with the interpolation method improved the average classification accuracy. The largest improvement occurred with 100 training samples: the model trained on the original data reached an average classification accuracy of 0.563, whereas four-fold enlargement by interpolation raised it significantly to 0.715. Compared with data augmentation by inversion and rotation, the model trained with the bilinear method improved the average classification accuracy by 0.095 with 100 training samples and by 0.015 with 50,000 training samples. For the chest X-ray images, the interpolation method yielded a stable and high average classification accuracy.

CONCLUSION

Enlarging images with an interpolation method before training is a useful approach to training CNNs. In the future, we aim to conduct additional verification on a variety of medical images to further clarify why image size is important.
