Zadeh Amir, Cao Yan Sheng, Hessner Simon, Liang Paul Pu, Poria Soujanya, Morency Louis-Philippe
LTI, SCS, CMU.
SCS, CMU.
Proc Conf Empir Methods Nat Lang Process. 2020 Nov;2020:1801-1812. doi: 10.18653/v1/2020.emnlp-main.141.
Modeling multimodal language is a core research area in natural language processing. While languages such as English have relatively large multimodal language resources, other widely spoken languages across the globe have few or no large-scale datasets in this area. This disproportionately affects native speakers of languages other than English. As a step towards building more equitable and inclusive multimodal systems, we introduce the first large-scale multimodal language dataset for Spanish, Portuguese, German and French. The proposed dataset, called CMU-MOSEAS (CMU Multimodal Opinion Sentiment, Emotions and Attributes), is the largest of its kind, with 40,000 total labelled sentences. It covers a diverse set of topics and speakers, and carries supervision of 20 labels including sentiment (and subjectivity), emotions, and attributes. Our evaluations on a state-of-the-art multimodal model demonstrate that CMU-MOSEAS enables further research for multilingual studies in multimodal language.