Improved Image Caption Rating - Datasets, Game, and Model.

Author information

Scott Andrew Taylor, Narins Lothar D, Kulkarni Anagha, Castanon Mar, Kao Benjamin, Ihorn Shasta, Siu Yue-Ting, Yoon Ilmi

Affiliations

Department of Computer Science, San Francisco State University, San Francisco, CA, USA.

Department of Psychology, San Francisco State University, San Francisco, CA, USA.

Publication information

Ext Abstr Hum Factors Computing Syst. 2023 Apr;2023. doi: 10.1145/3544549.3585632. Epub 2023 Apr 19.

DOI:10.1145/3544549.3585632

PMID:38545917

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10962002/

Abstract

How well a caption fits an image can be difficult to assess due to the subjective nature of caption quality. What is a good caption? We investigate this problem by focusing on image-caption ratings and by generating high-quality datasets from human feedback with gamification. We validate the datasets by showing a higher level of inter-rater agreement, and by using them to train custom machine learning models to predict new ratings. Our approach outperforms previous metrics: the resulting datasets are more easily learned and are of higher quality than other currently available datasets for image-caption rating.
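
As a rough illustration of what "inter-rater agreement" on image-caption ratings can mean, the sketch below computes Cohen's kappa for two raters who scored the same image-caption pairs. The abstract does not say which agreement statistic the authors used, so the choice of Cohen's kappa, the 1 to 5 rating scale, the function name `cohen_kappa`, and the sample ratings are all assumptions made for illustration only.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labeled the same items.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)

    # Observed agreement: fraction of items given the same rating by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: probability that both raters pick the same category
    # if each rated independently according to their own category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up ratings on an assumed 1-5 scale for eight image-caption pairs.
rater_1 = [1, 2, 3, 4, 5, 3, 2, 4]
rater_2 = [1, 2, 3, 4, 4, 3, 1, 4]
kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance. With more than two raters or an ordinal scale, a statistic such as Fleiss' kappa or a weighted kappa would likely be a better fit than this two-rater, unweighted version.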
