

Spatial-aware topic-driven-based image Chinese caption for disaster news.

Authors

Zhou Jinfei, Zhu Yaping, Zhang Yana, Yang Cheng, Pan Hong

Affiliations

State Key Laboratory of Media Convergence and Communication, The Communication University of China, Beijing 100024, China.

Data Science Research Institute, Swinburne University of Technology, Melbourne 3122, Australia.

Publication

Neural Comput Appl. 2023;35(13):9481-9500. doi: 10.1007/s00521-022-08072-w. Epub 2023 Mar 16.

DOI:10.1007/s00521-022-08072-w
PMID:37077618
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10019430/
Abstract

Automatically generating descriptions for disaster news images could effectively accelerate the spread of disaster information and relieve news editors of the burden of processing tedious news materials. Image caption algorithms are remarkable for generating captions directly from image content. However, current image caption algorithms trained on existing image caption datasets fail to describe disaster images with fundamental news elements. In this paper, we developed a large-scale disaster news image Chinese caption dataset (DNICC19k), which collected and annotated a large number of news images related to disasters. Furthermore, we proposed a spatial-aware topic-driven caption network (STCNet) to encode the interrelationships between these news objects and generate descriptive sentences related to news topics. STCNet first constructs a graph representation based on object feature similarity. The graph reasoning module uses spatial information to infer the weights of aggregated adjacent nodes according to a learnable Gaussian kernel function. Finally, the generation of news sentences is driven by the spatial-aware graph representations and the news topic distribution. Experimental results demonstrate that STCNet trained on DNICC19k not only automatically creates descriptive sentences related to news topics for disaster news images, but also outperforms benchmark models such as Bottom-up, NIC, Show-Attend and AoANet on multiple evaluation metrics, achieving CIDEr and BLEU-4 scores of 60.26 and 17.01, respectively.

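The abstract's graph reasoning step (appearance-similarity edges between detected objects, reweighted by a Gaussian kernel over their spatial distance) can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names, the cosine-similarity adjacency, and the fixed `sigma` (a learned parameter in STCNet) are all assumptions for the sketch.

```python
import math

def gaussian_kernel(dist, sigma):
    # Gaussian kernel over spatial distance: nearer objects get larger weights.
    return math.exp(-dist * dist / (2.0 * sigma * sigma))

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def spatial_graph_reasoning(feats, centers, sigma=0.5):
    """One round of spatial-aware feature aggregation over detected objects.

    feats   -- list of object appearance feature vectors
    centers -- list of (x, y) bounding-box centers in normalized coordinates
    sigma   -- kernel bandwidth (learnable in the paper; fixed here for the sketch)
    """
    n = len(feats)
    out = []
    for i in range(n):
        # Edge weight = appearance similarity modulated by the spatial kernel.
        weights = []
        for j in range(n):
            dist = math.dist(centers[i], centers[j])
            weights.append(cosine(feats[i], feats[j]) * gaussian_kernel(dist, sigma))
        total = sum(weights)
        # Aggregate neighbor features with row-normalized weights.
        out.append([sum((w / total) * feats[j][k] for j, w in enumerate(weights))
                    for k in range(len(feats[i]))])
    return out
```

The effect is that two objects contribute to each other's updated representation only when they are both visually similar and spatially close, which is the intuition behind conditioning caption generation on scene layout.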

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/b796cfa45dc7/521_2022_8072_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/43716cf23357/521_2022_8072_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/83ac8ba369ca/521_2022_8072_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/89d08edcb1ee/521_2022_8072_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/0f2f55353058/521_2022_8072_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/231d1db9fc1b/521_2022_8072_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/690cae0c4d51/521_2022_8072_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/d13385e69da7/521_2022_8072_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/885a1e185556/521_2022_8072_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/1d3f2a46caa0/521_2022_8072_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/949d3678774c/521_2022_8072_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/852886a0148f/521_2022_8072_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63a0/10019430/29bfe7275bda/521_2022_8072_Fig13_HTML.jpg

Similar articles

1
Spatial-aware topic-driven-based image Chinese caption for disaster news.
Neural Comput Appl. 2023;35(13):9481-9500. doi: 10.1007/s00521-022-08072-w. Epub 2023 Mar 16.
2
Chinese Image Caption Generation via Visual Attention and Topic Modeling.
IEEE Trans Cybern. 2022 Feb;52(2):1247-1257. doi: 10.1109/TCYB.2020.2997034. Epub 2022 Feb 16.
3
DIC-Transformer: interpretation of plant disease classification results using image caption generation technology.
Front Plant Sci. 2024 Jan 25;14:1273029. doi: 10.3389/fpls.2023.1273029. eCollection 2023.
4
Automatic caption generation for news images.
IEEE Trans Pattern Anal Mach Intell. 2013 Apr;35(4):797-812. doi: 10.1109/TPAMI.2012.118.
5
Topic-Oriented Image Captioning Based on Order-Embedding.
IEEE Trans Image Process. 2019 Jun;28(6):2743-2754. doi: 10.1109/TIP.2018.2889922. Epub 2018 Dec 27.
6
Enhancing image caption generation through context-aware attention mechanism.
Heliyon. 2024 Aug 19;10(17):e36272. doi: 10.1016/j.heliyon.2024.e36272. eCollection 2024 Sep 15.
7
An Ensemble of Generation- and Retrieval-based Image Captioning with Dual Generator Generative Adversarial Network.
IEEE Trans Image Process. 2020 Oct 15;PP. doi: 10.1109/TIP.2020.3028651.
8
On Distinctive Image Captioning via Comparing and Reweighting.
IEEE Trans Pattern Anal Mach Intell. 2023 Feb;45(2):2088-2103. doi: 10.1109/TPAMI.2022.3159811. Epub 2023 Jan 6.
9
An image caption model based on attention mechanism and deep reinforcement learning.
Front Neurosci. 2023 Oct 5;17:1270850. doi: 10.3389/fnins.2023.1270850. eCollection 2023.
10
Multilevel Attention Networks and Policy Reinforcement Learning for Image Caption Generation.
Big Data. 2022 Dec;10(6):481-492. doi: 10.1089/big.2021.0049. Epub 2021 Nov 2.

References cited in this article

1
Chinese Image Caption Generation via Visual Attention and Topic Modeling.
IEEE Trans Cybern. 2022 Feb;52(2):1247-1257. doi: 10.1109/TCYB.2020.2997034. Epub 2022 Feb 16.
2
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
IEEE Trans Pattern Anal Mach Intell. 2017 Jun;39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031. Epub 2016 Jun 6.
3
Babytalk: understanding and generating simple image descriptions.
IEEE Trans Pattern Anal Mach Intell. 2013 Dec;35(12):2891-903. doi: 10.1109/TPAMI.2012.162.
4
Automatic caption generation for news images.
IEEE Trans Pattern Anal Mach Intell. 2013 Apr;35(4):797-812. doi: 10.1109/TPAMI.2012.118.