AI Department, IT Convergence R&D Center, Vitasoft, Seoul, South Korea.
School of Computer Science and Engineering, Kyungpook National University, Daegu, 41586, South Korea.
Sci Rep. 2023 Mar 3;13(1):3595. doi: 10.1038/s41598-023-30480-8.
Extracting useful features at multiple scales is a crucial task in computer vision. The emergence of deep-learning techniques and advances in convolutional neural networks (CNNs) have enabled effective multiscale feature extraction, yielding stable performance improvements in numerous real-life applications. However, current state-of-the-art methods rely primarily on parallel multiscale feature extraction, and despite their competitive accuracy, these models are computationally inefficient and generalize poorly to small-scale images. Moreover, efficient and lightweight networks cannot adequately learn useful features, which causes underfitting when training on small-scale images or on datasets with a limited number of samples. To address these problems, we propose a novel image classification system based on elaborate data preprocessing steps and a carefully designed CNN architecture. Specifically, we present a consecutive multiscale feature-learning network (CMSFL-Net) that learns features consecutively, using feature maps with different receptive fields, to achieve faster training and inference together with higher accuracy. In experiments on six real-life image classification datasets, covering small-scale, large-scale, and limited-data settings, CMSFL-Net achieves accuracy comparable to that of existing state-of-the-art efficient networks. Moreover, the proposed system outperforms them in efficiency and speed and achieves the best accuracy-efficiency trade-off.
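To make the contrast between parallel and consecutive multiscale feature extraction concrete, the following minimal sketch compares receptive fields and weight counts for the two layouts. It is not the CMSFL-Net architecture: the abstract does not give kernel sizes or channel widths, so the kernel lists, the channel count `C`, and the helper names `stacked_receptive_fields` and `conv_params` are all hypothetical choices made for illustration.

```python
def stacked_receptive_fields(kernel_sizes, strides=None):
    """Receptive field seen after each layer of a sequential conv stack."""
    strides = strides or [1] * len(kernel_sizes)
    rf, jump, fields = 1, 1, []
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump   # each layer widens the field by (k - 1) * jump
        jump *= s              # stride multiplies the step between outputs
        fields.append(rf)
    return fields


def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


C = 64  # hypothetical channel width, not taken from the paper

# Parallel layout (assumed): three branches with 3x3, 5x5, and 7x7 kernels.
parallel_kernels = [3, 5, 7]
parallel_params = sum(conv_params(k, C, C) for k in parallel_kernels)

# Consecutive layout (assumed): three stacked 3x3 convolutions; the output of
# each layer already corresponds to a different receptive field.
consecutive_kernels = [3, 3, 3]
consecutive_rfs = stacked_receptive_fields(consecutive_kernels)
consecutive_params = sum(conv_params(k, C, C) for k in consecutive_kernels)

print("receptive fields (consecutive):", consecutive_rfs)
print("weights, parallel branches:   ", parallel_params)
print("weights, consecutive stack:   ", consecutive_params)
```

Under these assumptions, the stacked 3x3 layers cover the same 3, 5, and 7 pixel receptive fields as the parallel branches while using roughly a third of the weights, which is one plausible reason a consecutive design could be more efficient.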