Dong Jiuqing, Zhou Heng, Fuentes Alvaro, Yoon Sook, Park Dong Sun
School of Computer and Information Engineering, Institute for Artificial Intelligence, Shanghai Polytechnic University, Shanghai, China.
Department of Electronic Engineering, Jeonbuk National University, Jeonju, Jeollabuk-do, Republic of Korea.
Front Plant Sci. 2025 Aug 15;16:1623907. doi: 10.3389/fpls.2025.1623907. eCollection 2025.
Plant diseases pose a significant threat to agriculture, impacting food security and public health. Most existing plant disease recognition methods operate in closed-set settings, where disease categories are fixed during training, making them ineffective against novel diseases. This study extends plant disease recognition to an open-set scenario, enabling the identification of both known and unknown classes for real-world applicability. We first benchmark the anomaly detection performance of three major visual frameworks, namely convolutional neural networks (CNNs), vision transformers (ViTs), and vision-language models (VLMs), under varying fine-tuning strategies. To address the limitations of individual models, we propose a knowledge-ensemble-based method that integrates the general knowledge of pre-trained models with the domain-specific knowledge of fine-tuned models in both the logit and feature spaces. Our method significantly improves over existing baselines. For example, on vision-language models with 16 shots per class, our approach reduces the FPR@TPR95 from 43.88% to 7.05%; in the all-shot setting, it reduces the FPR@TPR95 from 15.38% to 0.71%. Extensive experiments confirm the robustness and generalizability of our approach across diverse model architectures and training paradigms. We will release the code soon at https://github.com/JiuqingDong/Enhancing_Anomaly_Detection.
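To make the reported numbers concrete, the sketch below illustrates the FPR@TPR95 metric (the false-positive rate on unknown-class samples at the score threshold that accepts 95% of known-class samples) together with a hypothetical logit-space knowledge ensemble. The abstract does not specify the exact scoring rule, so `msp` (maximum softmax probability) and the weighted average in `ensemble_score` are illustrative assumptions, not the authors' method.

```python
import numpy as np

def msp(logits: np.ndarray) -> np.ndarray:
    """Maximum softmax probability per sample; a common anomaly score
    (higher = more confidently a known class)."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p = e / e.sum(axis=-1, keepdims=True)
    return p.max(axis=-1)

def ensemble_score(logits_pretrained: np.ndarray,
                   logits_finetuned: np.ndarray,
                   alpha: float = 0.5) -> np.ndarray:
    """Hypothetical logit-space ensemble: blend the score from a
    pre-trained (general-knowledge) model with that of a fine-tuned
    (domain-specific) model."""
    return alpha * msp(logits_pretrained) + (1 - alpha) * msp(logits_finetuned)

def fpr_at_tpr95(known_scores: np.ndarray,
                 unknown_scores: np.ndarray,
                 tpr: float = 0.95) -> float:
    """FPR@TPR95: fraction of unknown-class samples accepted as known
    at the threshold where `tpr` of known samples are accepted."""
    # Threshold that accepts the top `tpr` fraction of known samples.
    thresh = np.quantile(known_scores, 1.0 - tpr)
    # Unknowns scoring at or above the threshold are false positives.
    return float(np.mean(unknown_scores >= thresh))
```

A lower FPR@TPR95 is better: the reported drop from 43.88% to 7.05% means far fewer unknown-disease images slip past the detector while the same 95% of known-disease images are still accepted.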