CMU-MOSEAS: A Multimodal Language Dataset for Spanish, Portuguese, German and French.

Authors

Zadeh Amir, Cao Yan Sheng, Hessner Simon, Liang Paul Pu, Poria Soujanya, Morency Louis-Philippe

Affiliations

LTI, SCS, CMU.

SCS, CMU.

Publication

Proc Conf Empir Methods Nat Lang Process. 2020 Nov;2020:1801-1812. doi: 10.18653/v1/2020.emnlp-main.141.

Abstract

Modeling multimodal language is a core research area in natural language processing. While languages such as English have relatively large multimodal language resources, other widely spoken languages across the globe have few or no large-scale datasets in this area. This disproportionately affects native speakers of languages other than English. As a step towards building more equitable and inclusive multimodal systems, we introduce the first large-scale multimodal language dataset for Spanish, Portuguese, German and French. The proposed dataset, called CMU-MOSEAS (CMU Multimodal Opinion Sentiment, Emotions and Attributes), is the largest of its kind with 40,000 total labelled sentences. It covers a diverse set of topics and speakers, and carries supervision of 20 labels including sentiment (and subjectivity), emotions, and attributes. Our evaluations on a state-of-the-art multimodal model demonstrate that CMU-MOSEAS enables further research for multilingual studies in multimodal language.
