Challenges and Prospects in Vision and Language Research.

Authors

Kafle Kushal, Shrestha Robik, Kanan Christopher

Affiliations

Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, United States.

Paige, New York, NY, United States.

Publication information

Front Artif Intell. 2019 Dec 13;2:28. doi: 10.3389/frai.2019.00028. eCollection 2019.

Abstract

Language-grounded image understanding tasks have often been proposed as a method for evaluating progress in artificial intelligence. Ideally, these tasks should test a plethora of capabilities that integrate computer vision, reasoning, and natural language understanding. However, the datasets and evaluation procedures used in these tasks are replete with flaws that allow vision and language (V&L) algorithms to achieve good performance without a robust understanding of vision and language. We argue for this position based on several recent studies in the V&L literature and our own observations of dataset bias, robustness, and spurious correlations. Finally, we propose that several of these challenges can be mitigated by creating carefully designed benchmarks.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62d5/7861287/dc965b95e658/frai-02-00028-g0001.jpg
