

DIC-Transformer: interpretation of plant disease classification results using image caption generation technology

Authors

Zeng Qingtian, Sun Jian, Wang Shansong

Affiliations

College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China.

Publication

Front Plant Sci. 2024 Jan 25;14:1273029. doi: 10.3389/fpls.2023.1273029. eCollection 2023.

DOI:10.3389/fpls.2023.1273029
PMID:38333041
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10850568/
Abstract

Disease image classification systems play a crucial role in identifying disease categories in the field of agricultural diseases. However, current plant disease image classification methods can only predict the disease category and do not offer explanations for the characteristics of the predicted disease images. To address this limitation, this paper employed image description generation technology to produce distinct descriptions for different plant disease categories. A two-stage model called DIC-Transformer, which encompasses three tasks (detection, interpretation, and classification), was proposed. In the first stage, Faster R-CNN with a Swin Transformer backbone was used to detect the diseased area and generate the feature vector of the diseased image. In the second stage, the model used a Transformer to generate image captions and produced an image feature vector weighted by text features, which improves classification performance in the subsequent classification decoder. Additionally, a dataset of text and images for agricultural diseases (ADCG-18) was compiled; it covers 18 diseases, with images and descriptions of their characteristics. On ADCG-18, the DIC-Transformer was compared with 11 classical caption generation methods and 10 image classification models. Caption quality was evaluated with BLEU-1 to BLEU-4, CIDEr-D, and ROUGE: DIC-Transformer scored 0.756 (BLEU-1), 450.51 (CIDEr-D), and 0.721 (ROUGE), exceeding the best-performing comparison model, Fc, by 0.01, 29.55, and 0.014, respectively. Classification was evaluated with accuracy, recall, and F1 score: DIC-Transformer achieved 0.854, 0.854, and 0.853, exceeding the best-performing comparison model, MobileNetV2, by 0.024, 0.078, and 0.075, respectively. These results indicate that the DIC-Transformer outperforms the comparison models in both classification and caption generation.
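The caption metrics reported in the abstract (BLEU-1, CIDEr-D, ROUGE) all compare generated captions against reference captions. As a minimal sketch, the snippet below computes sentence-level BLEU-1, i.e. clipped unigram precision with a brevity penalty. It is a simplified illustration of the metric, not the paper's evaluation code (which uses standard corpus-level implementations), and the example captions are invented.

```python
from collections import Counter
import math

def bleu1(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-1: clipped unigram precision times a brevity penalty."""
    cand = candidate.split()
    ref = reference.split()
    cand_counts = Counter(cand)
    ref_counts = Counter(ref)
    # Each candidate word's count is clipped by its count in the reference,
    # so repeating a correct word cannot inflate the score.
    clipped = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = clipped / len(cand) if cand else 0.0
    # The brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * precision

# Hypothetical disease captions (not from ADCG-18):
print(bleu1("brown leaf spots with yellow halos",
            "brown leaf spots with yellow halos"))  # perfect match -> 1.0
print(bleu1("brown spots", "brown leaf spots"))     # short but correct -> penalized
```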

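The classification results in the abstract are reported as accuracy, recall, and F1 score over 18 disease classes. The sketch below computes accuracy plus macro-averaged recall and F1 for a multi-class task; it is a minimal illustration under the assumption of macro averaging, and the label names are hypothetical, not drawn from ADCG-18.

```python
from collections import defaultdict

def macro_scores(y_true, y_pred):
    """Return (accuracy, macro recall, macro F1) for multi-class labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    accuracy = sum(tp.values()) / len(y_true)
    recalls, f1s = [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        recalls.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return accuracy, sum(recalls) / len(labels), sum(f1s) / len(labels)

# Hypothetical predictions on four leaf images:
acc, rec, f1 = macro_scores(["rust", "rust", "mildew", "mildew"],
                            ["rust", "mildew", "mildew", "mildew"])
print(acc, rec, f1)
```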

Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4f5/10850568/2d89a9df14d5/fpls-14-1273029-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4f5/10850568/a98e8261b888/fpls-14-1273029-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4f5/10850568/3c1b9a5a2798/fpls-14-1273029-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4f5/10850568/55084e2acebd/fpls-14-1273029-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4f5/10850568/fdcf07d8ece8/fpls-14-1273029-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4f5/10850568/889bbdaa3e92/fpls-14-1273029-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4f5/10850568/83634def4cad/fpls-14-1273029-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4f5/10850568/e02b66f173a3/fpls-14-1273029-g008.jpg

Similar articles

1
DIC-Transformer: interpretation of plant disease classification results using image caption generation technology.
Front Plant Sci. 2024 Jan 25;14:1273029. doi: 10.3389/fpls.2023.1273029. eCollection 2023.
2
Swin-GA-RF: genetic algorithm-based Swin Transformer and random forest for enhancing cervical cancer classification.
Front Oncol. 2024 Jul 19;14:1392301. doi: 10.3389/fonc.2024.1392301. eCollection 2024.
3
Spatial-aware topic-driven-based image Chinese caption for disaster news.
Neural Comput Appl. 2023;35(13):9481-9500. doi: 10.1007/s00521-022-08072-w. Epub 2023 Mar 16.
4
Analysis of CT scan images for COVID-19 pneumonia based on a deep ensemble framework with DenseNet, Swin transformer, and RegNet.
Front Microbiol. 2022 Sep 23;13:995323. doi: 10.3389/fmicb.2022.995323. eCollection 2022.
5
Multi-modal transformer architecture for medical image analysis and automated report generation.
Sci Rep. 2024 Aug 20;14(1):19281. doi: 10.1038/s41598-024-69981-5.
6
Weakly Supervised Captioning of Ultrasound Images.
Med Image Underst Anal (2022). 2022 Jul;13413:187-198. doi: 10.1007/978-3-031-12053-4_14.
7
MBT: Model-Based Transformer for retinal optical coherence tomography image and video multi-classification.
Int J Med Inform. 2023 Oct;178:105178. doi: 10.1016/j.ijmedinf.2023.105178. Epub 2023 Aug 21.
8
Image-based scatter correction for cone-beam CT using flip swin transformer U-shape network.
Med Phys. 2023 Aug;50(8):5002-5019. doi: 10.1002/mp.16277. Epub 2023 Feb 14.
9
HyFormer: Hybrid Transformer and CNN for Pixel-Level Multispectral Image Land Cover Classification.
Int J Environ Res Public Health. 2023 Feb 9;20(4):3059. doi: 10.3390/ijerph20043059.
10
An image caption model based on attention mechanism and deep reinforcement learning.
Front Neurosci. 2023 Oct 5;17:1270850. doi: 10.3389/fnins.2023.1270850. eCollection 2023.

Cited by

1
Plant leaf disease recognition based on improved SinGAN and improved ResNet34.
Front Artif Intell. 2024 Jun 24;7:1414274. doi: 10.3389/frai.2024.1414274. eCollection 2024.
2
RNS2 is required for the biogenesis of a wounding responsive 16 nts tsRNA in Arabidopsis thaliana.
Plant Mol Biol. 2024 Jan 24;114(1):6. doi: 10.1007/s11103-023-01399-5.

References

1
A longan yield estimation approach based on UAV images and deep learning.
Front Plant Sci. 2023 Mar 6;14:1132909. doi: 10.3389/fpls.2023.1132909. eCollection 2023.
2
Variational Autoencoders-Based Self-Learning Model for Tumor Identification and Impact Analysis from 2-D MRI Images.
J Healthc Eng. 2023 Jan 17;2023:1566123. doi: 10.1155/2023/1566123. eCollection 2023.
3
Image-Based Automated Recognition of 31 Poaceae Species: The Most Relevant Perspectives.
Front Plant Sci. 2022 Jan 26;12:804140. doi: 10.3389/fpls.2021.804140. eCollection 2021.
4
On Diversity in Image Captioning: Metrics and Methods.
IEEE Trans Pattern Anal Mach Intell. 2022 Feb;44(2):1035-1049. doi: 10.1109/TPAMI.2020.3013834. Epub 2022 Jan 7.
5
Res2Net: A New Multi-Scale Backbone Architecture.
IEEE Trans Pattern Anal Mach Intell. 2021 Feb;43(2):652-662. doi: 10.1109/TPAMI.2019.2938758. Epub 2021 Jan 8.
6
Convolutional Neural Networks for the Automatic Identification of Plant Diseases.
Front Plant Sci. 2019 Jul 23;10:941. doi: 10.3389/fpls.2019.00941. eCollection 2019.
7
Attentive Linear Transformation for Image Captioning.
IEEE Trans Image Process. 2018 Jul 12. doi: 10.1109/TIP.2018.2855406.
8
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
IEEE Trans Pattern Anal Mach Intell. 2017 Jun;39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031. Epub 2016 Jun 6.
9
Babytalk: understanding and generating simple image descriptions.
IEEE Trans Pattern Anal Mach Intell. 2013 Dec;35(12):2891-903. doi: 10.1109/TPAMI.2012.162.