Zhou Houkui, Ding Qifeng, Chen Chang, Liao Qinqin, Wang Qun, Yu Huimin, Hu Haoji, Zhang Guangqun, Hu Junguo, He Tao
College of Mathematics and Computer Science, Zhejiang A & F University, Hangzhou 311300, China.
Zhejiang Provincial Key Laboratory of Forestry Intelligent Monitoring and Information Technology, Hangzhou 311300, China.
Sensors (Basel). 2025 May 21;25(10):3241. doi: 10.3390/s25103241.
With rapid urbanization, effective waste classification has become a critical challenge. Traditional manual sorting is time-consuming, labor-intensive, costly, and error-prone. Deep learning has transformed this field: convolutional neural networks (CNNs) such as VGG and ResNet have substantially improved automated sorting efficiency, and Transformer architectures such as the Swin Transformer have further improved performance and adaptability in complex sorting scenarios. However, these approaches still struggle in complex environments and with diverse waste types, often suffering from limited recognition accuracy, poor generalization, or prohibitive computational demands. To address these challenges, we propose the Hybrid-modal Fusion Waste Classification Network (HFWC-Net), an efficient hybrid-modal fusion method for precise waste image classification. HFWC-Net uses a Transformer-based hierarchical architecture that integrates CNNs and Transformers, enhancing feature capture and fusion across varied image types for better scalability and flexibility. By incorporating the Agent Attention mechanism and the LionBatch optimization strategy, HFWC-Net improves classification accuracy while significantly reducing classification time. Comparative experiments on the public Garbage Classification and TrashNet datasets and on our self-built MixTrash dataset show that HFWC-Net achieves Top-1 accuracies of 98.89%, 96.88%, and 94.35%, respectively. These results indicate that HFWC-Net attains the highest accuracy among the compared methods, with significant advantages for faster classification and for supporting automated waste management applications.
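The Top-1 accuracy figures quoted above follow the standard definition: the fraction of test images whose highest-scoring predicted class matches the true label. The sketch below illustrates that metric only. It is not the authors' evaluation code, and the function name `top1_accuracy`, the three-class score rows, and the labels are hypothetical toy values chosen for illustration.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the fraction of samples whose highest-scoring class equals the label.

    `scores` holds one row of per-class scores per sample (hypothetical model
    outputs); `labels` holds the true class index for each sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the Top-1 prediction for this sample.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy data (assumed, not taken from the paper): 4 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
labels = [1, 0, 2, 2]

accuracy = top1_accuracy(scores, labels)
print(f"Top-1 accuracy: {accuracy:.2%}")  # 3 of 4 correct
```

Under this definition, a reported 98.89% means that roughly 989 of every 1,000 test images received the correct class as the model's single top prediction.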