Improved Image Caption Rating - Datasets, Game, and Model.

Authors

Scott Andrew Taylor, Narins Lothar D, Kulkarni Anagha, Castanon Mar, Kao Benjamin, Ihorn Shasta, Siu Yue-Ting, Yoon Ilmi

Affiliations

Department of Computer Science, San Francisco State University, San Francisco, CA, USA.

Department of Psychology, San Francisco State University, San Francisco, CA, USA.

Publication information

Ext Abstr Hum Factors Computing Syst. 2023 Apr;2023. doi: 10.1145/3544549.3585632. Epub 2023 Apr 19.

Abstract

How well a caption fits an image can be difficult to assess due to the subjective nature of caption quality. What is a caption? We investigate this problem by focusing on image-caption ratings and by generating high-quality datasets from human feedback with gamification. We validate the datasets by showing a higher level of inter-rater agreement and by using them to train custom machine learning models to predict new ratings. Our approach outperforms previous metrics: the resulting datasets are more easily learned and are of higher quality than other currently available datasets for image-caption rating.
