Hassan Sk Mahmudul, Roy Kumar Sekhar, Hazarika Ruhul Amin, Alam Mehbub, Mukherjee Mithun
Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India.
Department of IT, Indian Institute of Information Technology, Guwahati, Assam, 781015, India.
Sci Rep. 2025 Aug 23;15(1):30997. doi: 10.1038/s41598-025-16142-x.
Timely and precise identification of plant diseases is essential for effective disease control and crop protection. Manual identification requires expert knowledge, and people with that domain knowledge are hard to find. To address this, researchers have proposed computer vision-based machine learning techniques in recent years. Most of these solutions apply standard convolutional neural network (CNN) approaches to leaf images captured in a laboratory setup with a uniform background; only a few works consider real-field images. There is therefore a need for a robust CNN architecture that can identify plant diseases in both laboratory-conditioned and real-field images. In this paper, we propose an Inception-Enhanced Vision Transformer (IEViT) architecture for identifying plant diseases. The proposed IEViT architecture extracts both local and global features, which improves feature learning. Using multiple filters with different kernel sizes makes efficient use of computing resources, extracting relevant features without the need for deeper networks. The robustness of the proposed architecture is established through hyper-parameter tuning and comparison with state-of-the-art models. The experiments cover five datasets containing both laboratory-conditioned and real-field images. The experimental results show that the proposed model outperforms state-of-the-art deep learning models while using fewer parameters. The proposed model achieves an accuracy of 99.23% on the apple leaf dataset, 99.70% on the rice dataset, 97.02% on the ibean dataset, 76.51% on the cassava leaf dataset, and 99.41% on the PlantVillage dataset.
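The claim that parallel filters of different kernel sizes use computing resources efficiently can be illustrated with a simple parameter count. The sketch below is not the IEViT configuration, which the abstract does not specify. It assumes a generic four-branch Inception-style block (1x1, 3x3, 5x5, and a pooled 1x1 projection) with 1x1 channel reduction before the larger kernels, and compares it with a single 5x5 convolution that produces the same number of output channels. All channel counts and the helper names `conv_params` and `inception_block_params` are hypothetical choices made for illustration.

```python
def conv_params(in_ch, out_ch, kernel, bias=True):
    """Weights (plus optional biases) of one 2D convolution with a square kernel."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def inception_block_params(in_ch, branch_ch, reduce_ch):
    """Parameter count of a hypothetical four-branch Inception-style block.

    Assumed layout (not taken from the paper):
      branch 1: 1x1 conv
      branch 2: 1x1 reduction, then 3x3 conv
      branch 3: 1x1 reduction, then 5x5 conv
      branch 4: pooling (no weights), then 1x1 projection
    Each branch emits branch_ch channels, so the block emits 4 * branch_ch.
    """
    b1 = conv_params(in_ch, branch_ch, 1)
    b3 = conv_params(in_ch, reduce_ch, 1) + conv_params(reduce_ch, branch_ch, 3)
    b5 = conv_params(in_ch, reduce_ch, 1) + conv_params(reduce_ch, branch_ch, 5)
    bp = conv_params(in_ch, branch_ch, 1)
    return b1 + b3 + b5 + bp


# Illustrative sizes only: 64 input channels, 32 channels per branch,
# 16 channels after each 1x1 reduction.
in_ch, branch_ch, reduce_ch = 64, 32, 16

block = inception_block_params(in_ch, branch_ch, reduce_ch)
plain = conv_params(in_ch, 4 * branch_ch, 5)  # single 5x5 conv, same output width

print(f"Inception-style block: {block} parameters")
print(f"Single 5x5 convolution: {plain} parameters")
```

Under these assumed sizes the multi-branch block covers three receptive-field sizes with far fewer parameters than the single large-kernel layer, which is the general sense in which such blocks avoid the need for wider or deeper plain convolutions. The actual savings in IEViT depend on its real layer sizes.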