Cross-Modal Data Fusion via Vision-Language Model for Crop Disease Recognition.

Authors

Liu Wenjie, Wu Guoqing, Wang Han, Ren Fuji

Affiliations

School of Transportation and Civil Engineering, Nantong University, Nantong 226019, China.

School of Mechanical Engineering, Nantong Institute of Technology, Nantong 226002, China.

Publication information

Sensors (Basel). 2025 Jun 30;25(13):4096. doi: 10.3390/s25134096.

DOI:10.3390/s25134096
PMID:40648350
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12251865/
Abstract

Crop diseases pose a significant threat to agricultural productivity and global food security. Timely and accurate disease identification is crucial for improving crop yield and quality. Most existing deep learning-based methods rely on image datasets alone for disease recognition and often overlook the complementary role of textual features in enhancing visual understanding. To address this problem, we propose a cross-modal data fusion method based on a vision-language model for crop disease recognition. Our approach leverages the Zhipu.ai multi-model to generate comprehensive textual descriptions of crop leaf diseases, comprising a global description, a local lesion description, and a color-texture description. These descriptions are encoded into feature vectors, while an image encoder extracts image features. A cross-attention mechanism then iteratively fuses the multimodal features across multiple layers, and a classification prediction module generates classification probabilities. Extensive experiments on the Soybean Disease, AI Challenge 2018, and PlantVillage datasets demonstrate that our method outperforms state-of-the-art image-only approaches, achieving higher accuracy with fewer parameters. Specifically, with only 1.14M parameters, our model achieves recognition accuracies of 98.74%, 87.64%, and 99.08% on the three datasets, respectively. The results highlight the effectiveness of cross-modal learning in leveraging both visual and textual cues for precise and efficient disease recognition, offering a scalable solution for crop disease recognition.
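
The fusion pipeline described in the abstract (encoded text descriptions, image features, multi-layer cross-attention, classification probabilities) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoders, dimensions, or layer design, so the random stand-in features, the single-head attention, the residual connection, the mean pooling, and every name below (`cross_attention`, `classify`, `DIM`, and so on) are assumptions chosen only to show the general technique.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Random weights or features; a stand-in for trained parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def cross_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Image tokens act as queries; text tokens supply keys and values."""
    q = matmul(image_tokens, w_q)
    k = matmul(text_tokens, w_k)
    v = matmul(text_tokens, w_v)
    dim = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    attended = matmul(weights, v)
    # Residual connection (an assumption) keeps the original visual features.
    fused = [[i + a for i, a in zip(img, att)]
             for img, att in zip(image_tokens, attended)]
    return fused, weights


def classify(image_tokens, text_tokens, layers, w_out):
    """Fuse over several layers, mean-pool, and return class probabilities."""
    fused = image_tokens
    for w_q, w_k, w_v in layers:
        fused, _ = cross_attention(fused, text_tokens, w_q, w_k, w_v)
    pooled = [sum(col) / len(fused) for col in zip(*fused)]
    logits = matmul([pooled], w_out)[0]
    return softmax(logits)


rng = random.Random(0)
DIM, NUM_CLASSES, NUM_LAYERS = 8, 3, 2  # hypothetical sizes
# Stand-ins for encoder outputs: 4 image patches and 3 text descriptions
# (global, local lesion, color-texture).
image_tokens = random_matrix(4, DIM, rng, scale=1.0)
text_tokens = random_matrix(3, DIM, rng, scale=1.0)
layers = [tuple(random_matrix(DIM, DIM, rng) for _ in range(3))
          for _ in range(NUM_LAYERS)]
w_out = random_matrix(DIM, NUM_CLASSES, rng)

probs = classify(image_tokens, text_tokens, layers, w_out)
print(probs)
```

Using the image tokens as queries means each visual feature is re-weighted by the text descriptions most relevant to it; a real model would learn the projection matrices rather than sample them at random.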

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a76/12251865/9c3d5e556fae/sensors-25-04096-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a76/12251865/75099bf90839/sensors-25-04096-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a76/12251865/ccee84bc3c23/sensors-25-04096-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a76/12251865/e60eab5994d3/sensors-25-04096-g003.jpg
