Saadati Mojdeh, Balu Aditya, Chiranjeevi Shivani, Jubery Talukder Zaki, Singh Asheesh K, Sarkar Soumik, Singh Arti, Ganapathysubramanian Baskar
Department of Computer Science, Iowa State University, Ames, IA, USA.
Department of Mechanical Engineering, Iowa State University, Ames, IA, USA.
Plant Phenomics. 2024 Apr 30;6:0170. doi: 10.34133/plantphenomics.0170. eCollection 2024.
Plants encounter a variety of beneficial and harmful insects during their growth cycle. Accurate identification (i.e., detecting the presence of insects) and classification (i.e., determining the type or class) of these insect species are critical for implementing prompt and suitable mitigation strategies. Such timely actions carry substantial economic and environmental implications. Deep learning-based approaches have produced models with good insect classification accuracy. Researchers aiming to deploy identification and classification models in agriculture face challenges when input images deviate markedly from the training distribution (e.g., images of vehicles or humans, blurred images, or insect classes on which the model was not trained). Out-of-distribution (OOD) detection algorithms provide an exciting avenue to overcome these challenges, as they ensure that a model abstains from making incorrect classification predictions on images that belong to non-insect and/or untrained insect classes. As far as we know, no prior in-depth exploration has been conducted on the role of OOD detection algorithms in addressing agricultural issues. Here, we generate and evaluate the performance of state-of-the-art OOD algorithms on insect detection classifiers. These algorithms represent a diversity of methods for addressing an OOD problem. Specifically, we focus on extrusive algorithms, i.e., algorithms that wrap around a well-trained classifier without the need for additional co-training. We compared three OOD detection algorithms: (a) the maximum softmax probability algorithm, which uses the softmax value as a confidence score; (b) the Mahalanobis distance (MAH)-based algorithm, which uses a generative classification approach; and (c) the energy-based algorithm, which maps the input data to a scalar value, called energy.
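As an illustration of the three scores named above, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the classifier exposes logits and penultimate-layer features, that the MAH variant uses class means with a single shared covariance, and that a higher score means "more in-distribution"; all function names and the temperature parameter are hypothetical.

```python
import numpy as np

def logsumexp(logits, axis=-1):
    # Numerically stable log(sum(exp(logits))) along the class axis.
    m = np.max(logits, axis=axis, keepdims=True)
    return (m + np.log(np.sum(np.exp(logits - m), axis=axis, keepdims=True))).squeeze(axis)

def msp_score(logits):
    """Maximum softmax probability: the largest class probability per sample."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    p = np.exp(z) / np.sum(np.exp(z), axis=-1, keepdims=True)
    return p.max(axis=-1)

def energy_score(logits, temperature=1.0):
    """Negative free energy, T * logsumexp(logits / T), as a scalar per sample."""
    return temperature * logsumexp(logits / temperature)

def fit_class_gaussians(features, labels):
    """Fit class means and one shared precision matrix (assumed MAH variant)."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center each sample on the mean of its own class before pooling.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    return means, np.linalg.pinv(cov)

def mahalanobis_score(features, means, precision):
    """Negative squared Mahalanobis distance to the closest class mean."""
    diff = features[:, None, :] - means[None, :, :]          # (n, classes, dim)
    d2 = np.einsum("ncd,de,nce->nc", diff, precision, diff)  # (n, classes)
    return -d2.min(axis=1)
```

All three functions only read outputs of an already trained classifier, which is what makes them usable as wrappers without any co-training.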
We performed an extensive series of evaluations of these OOD algorithms across three performance axes: (a) Base model accuracy: How does the accuracy of the classifier impact OOD performance? (b) Domain dissimilarity: How does the level of dissimilarity to the domain impact OOD performance? (c) Data imbalance: How sensitive is OOD performance to the imbalance in per-class sample size? Evaluating OOD algorithms across these performance axes provides practical guidelines to ensure the robust performance of well-trained models in the wild, which is a key consideration for agricultural applications. Based on this analysis, we proposed the most effective OOD algorithm as a wrapper for the insect classifier with the highest accuracy, and we present its OOD detection performance in the paper. Our results indicate that OOD detection algorithms can significantly enhance user trust in insect pest classification by abstaining from classification under uncertain conditions.
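The idea of abstaining under uncertain conditions can be sketched as a thin wrapper around any OOD score. This is a hedged example in plain Python, not the procedure from the paper: the rule of choosing a threshold that keeps a fixed fraction of held-out in-distribution scores, the names `pick_threshold` and `classify_or_abstain`, and the class labels are all assumptions made for illustration.

```python
def pick_threshold(in_dist_scores, keep_fraction=0.95):
    """Return a cutoff that keeps roughly `keep_fraction` of in-distribution scores.

    Assumes higher scores mean "more in-distribution".
    """
    ordered = sorted(in_dist_scores)
    cut = int((1.0 - keep_fraction) * len(ordered))
    return ordered[cut]

def classify_or_abstain(class_scores, ood_score, threshold, labels):
    """Return the predicted label, or None when the OOD score is below the cutoff."""
    if ood_score < threshold:
        return None  # abstain: the input looks out-of-distribution
    best = max(range(len(class_scores)), key=class_scores.__getitem__)
    return labels[best]

# Hypothetical usage with made-up scores and labels.
held_out_scores = [i / 100 for i in range(100)]
threshold = pick_threshold(held_out_scores, keep_fraction=0.5)
prediction = classify_or_abstain([0.1, 2.3], 0.9, threshold, ["aphid", "ladybird"])
rejected = classify_or_abstain([0.1, 2.3], 0.1, threshold, ["aphid", "ladybird"])
```

Returning `None` rather than a low-confidence label keeps the abstention explicit, so downstream code has to handle the rejected case deliberately.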