Scott Andrew Taylor, Narins Lothar D, Kulkarni Anagha, Castanon Mar, Kao Benjamin, Ihorn Shasta, Siu Yue-Ting, Yoon Ilmi
Department of Computer Science, San Francisco State University, San Francisco, CA, USA.
Department of Psychology, San Francisco State University, San Francisco, CA, USA.
Ext Abstr Hum Factors Computing Syst. 2023 Apr;2023. doi: 10.1145/3544549.3585632. Epub 2023 Apr 19.
How well a caption fits an image can be difficult to assess because caption quality is subjective. We investigate this problem by focusing on image-caption ratings and by using gamification to generate high-quality datasets from human feedback. We validate the datasets by showing a higher level of inter-rater agreement and by using them to train custom machine learning models that predict new ratings. Our approach outperforms previous metrics: the resulting datasets are more easily learned and are of higher quality than other currently available datasets for image-caption rating.
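As an illustration of the inter-rater agreement mentioned in the abstract, the following is a minimal sketch of one common agreement statistic, Cohen's kappa, for two raters. The abstract does not say which statistic the authors used, so the choice of kappa, the function name `cohen_kappa`, and the example ratings are all assumptions made for illustration only.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items.

    Hypothetical helper: the source does not specify its agreement measure.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")

    n = len(ratings_a)

    # Observed agreement: fraction of items given the same rating by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: probability that both raters pick the same label
    # if each rated independently according to their own label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up ratings of five image-caption pairs on a 1-3 scale.
rater_a = [1, 2, 3, 3, 2]
rater_b = [1, 2, 3, 2, 2]
kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.4f}")
```

A kappa near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance. With more than two raters or ordinal scales, a statistic such as Fleiss' kappa or Krippendorff's alpha may be more appropriate.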