Li Sheng, Chang Zhousheng, Liu Haizhen
Innovation and Entrepreneurship Institute, Guangxi Normal University, Guilin, China.
Doctoral College, University for the Creative Arts, Epsom, United Kingdom.
PLoS One. 2025 May 14;20(5):e0322836. doi: 10.1371/journal.pone.0322836. eCollection 2025.
Image annotation and classification have attracted considerable attention in computer vision because of their broad use in applications such as medical image analysis, intelligent surveillance, and image retrieval. However, existing methods handle unknown target-domain data poorly, showing reduced classification accuracy and insufficient generalization ability. To address the effect of distribution differences between the source and target domains on classification performance, this study proposes an open-set domain adaptive image annotation and classification model based on dynamic threshold control and a subdomain alignment strategy. The model uses a channel attention mechanism to dynamically extract important features, improves cross-domain feature alignment through dynamic weight adjustment and subdomain alignment, and balances classification performance on known and unknown categories through dynamic threshold control. Experiments were conducted on the ImageNet and COCO datasets. The proposed model achieves a classification accuracy of up to 93.5% in the unknown target domain and 89.6% in the known target domain, exceeding the best results of existing methods. Its precision and recall reach up to 89.6% and 90.7%, respectively, and its classification time is only 1.2 seconds, indicating gains in both accuracy and efficiency. These results show that the method can effectively improve the robustness and generalization ability of image annotation and classification in open-set scenarios, and it offers a new approach to domain adaptation problems in real-world settings.
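The abstract does not specify how the dynamic threshold is computed, so the following is only a minimal sketch of the general idea of rejecting low-confidence samples as "unknown" with a threshold that adapts to each batch. The rule used here (batch mean of the maximum softmax confidence minus `k` standard deviations, clipped at a floor) is an assumption chosen for illustration, not the authors' method, and the names `UNKNOWN`, `dynamic_threshold`, `classify_open_set`, `k`, and `floor` are hypothetical.

```python
import math
from statistics import mean, pstdev

UNKNOWN = -1  # hypothetical label for samples rejected as "unknown"


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_threshold(confidences, k=1.0, floor=0.5):
    """Assumed rule: batch mean minus k standard deviations, never below floor."""
    return max(floor, mean(confidences) - k * pstdev(confidences))


def classify_open_set(batch_logits, k=1.0, floor=0.5):
    """Return (labels, threshold) for a batch of known-class logits.

    A sample keeps its argmax label when its top softmax probability
    reaches the batch threshold; otherwise it is labeled UNKNOWN.
    """
    probs = [softmax(logits) for logits in batch_logits]
    confs = [max(p) for p in probs]
    tau = dynamic_threshold(confs, k, floor)
    labels = [
        p.index(max(p)) if c >= tau else UNKNOWN
        for p, c in zip(probs, confs)
    ]
    return labels, tau


if __name__ == "__main__":
    # Two confident samples and one nearly uniform (ambiguous) sample.
    logits = [[5.0, 0.0, 0.0], [0.0, 6.0, 0.0], [0.1, 0.0, 0.05]]
    labels, tau = classify_open_set(logits)
    print(labels, round(tau, 3))
```

In this sketch, raising `k` lowers the threshold and admits more samples as known classes, while raising `floor` rejects more samples as unknown; this is the known-versus-unknown trade-off that the abstract says dynamic threshold control is meant to balance.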