

A Vegetable Leaf Disease Identification Model Based on Image-Text Cross-Modal Feature Fusion.

Author Information

Feng Xuguang, Zhao Chunjiang, Wang Chunshan, Wu Huarui, Miao Yisheng, Zhang Jingjian

Affiliations

School of Information Science and Technology, Hebei Agricultural University, Baoding, China.

National Engineering Research Center for Information Technology in Agriculture, Beijing, China.

Publication Information

Front Plant Sci. 2022 Jun 24;13:918940. doi: 10.3389/fpls.2022.918940. eCollection 2022.

Abstract

In view of the variation in the appearance of crop diseases and the complexity of field backgrounds, automatic identification of field diseases is an extremely challenging topic in smart agriculture. To address this challenge, a popular approach is to design a Deep Convolutional Neural Network (DCNN) model that extracts visual disease features from the images and then identifies the diseases based on the extracted features. This approach performs well under simple background conditions but has low accuracy and poor robustness under complex backgrounds. In this paper, an end-to-end disease identification model composed of a disease-spot region detector and a disease classifier (YOLOv5s + BiCMT) was proposed. Specifically, the YOLOv5s network was used to detect the disease-spot regions so as to provide a regional attention mechanism that facilitates the disease identification task of the classifier. For the classifier, a Bidirectional Cross-Modal Transformer (BiCMT) model combining image and text modal information was constructed, which exploits the correlation and complementarity between the features of the two modalities to achieve the fusion and recognition of disease features. Meanwhile, the problem of inconsistent sequence lengths between the two modalities was solved. Eventually, the YOLOv5s + BiCMT model achieved the best results on a small dataset: its Accuracy, Precision, Sensitivity, and Specificity reached 99.23%, 97.37%, 97.54%, and 99.54%, respectively. This paper demonstrates that bidirectional cross-modal feature fusion combining disease images and texts is an effective method for identifying vegetable diseases in field environments.
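The abstract does not detail BiCMT's internal layers, so the following is only a minimal sketch of the general idea it describes: bidirectional cross-modal attention, which naturally handles image and text token sequences of different lengths because the query sequence and the key/value sequence in attention need not be the same length. All module names, dimensions, and token counts below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a bidirectional cross-modal attention block (assumed
# design; not the paper's actual BiCMT architecture).
import torch
import torch.nn as nn


class BidirectionalCrossModalBlock(nn.Module):
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        # Cross-attention tolerates unequal sequence lengths: queries come
        # from one modality, keys/values from the other.
        self.img_to_txt = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.txt_to_img = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_img = nn.LayerNorm(dim)
        self.norm_txt = nn.LayerNorm(dim)

    def forward(self, img_tokens, txt_tokens):
        # Image tokens attend to text tokens (image queries, text keys/values).
        img_fused, _ = self.img_to_txt(img_tokens, txt_tokens, txt_tokens)
        # Text tokens attend to image tokens (text queries, image keys/values).
        txt_fused, _ = self.txt_to_img(txt_tokens, img_tokens, img_tokens)
        # Residual connections plus normalization, one per modality.
        return self.norm_img(img_tokens + img_fused), self.norm_txt(txt_tokens + txt_fused)


# Toy usage: 196 image-patch tokens (e.g., from a cropped disease-spot region)
# and 32 text tokens (e.g., an embedded disease description), both projected
# to a shared 256-dimensional space beforehand.
if __name__ == "__main__":
    block = BidirectionalCrossModalBlock()
    img = torch.randn(1, 196, 256)
    txt = torch.randn(1, 32, 256)
    fused_img, fused_txt = block(img, txt)
    print(fused_img.shape, fused_txt.shape)  # (1, 196, 256) and (1, 32, 256)
```

In such a design, each modality keeps its own sequence length after fusion, which is one common way to sidestep the length-mismatch problem the abstract mentions; the fused tokens would then be pooled and passed to a classification head.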


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2379/9263697/cfa475e61666/fpls-13-918940-g001.jpg
