The pluralization palette: unveiling semantic clusters in English nominal pluralization through distributional semantics.

Authors

Shafaei-Bajestan Elnaz, Moradipour-Tari Masoumeh, Uhrig Peter, Baayen R Harald

Affiliations

Department of General and Computational Linguistics, University of Tübingen, Wilhelmstraße 19, Tübingen, 72074 Baden-Württemberg, Germany.

Department of English and American Studies, Friedrich-Alexander-Universität Erlangen-Nürnberg, Bismarckstraße 1, Erlangen, 91054 Bayern, Germany.

Publication information

Morphology (Dordr). 2024;34(4):369-413. doi: 10.1007/s11525-024-09428-9. Epub 2024 Jul 12.

Abstract

Using distributional semantics, we show that English nominal pluralization exhibits semantic clusters. For instance, the change in semantic space from singulars to plurals differs depending on whether a word denotes, e.g., a fruit or an animal. Languages with extensive noun classes, such as Swahili and Kiowa, distinguish between these kinds of words in their morphology. In English, although it is not marked morphologically, plural semantics also varies by semantic class. We introduce a semantically informed method, CosClassAvg, and compare it with two other methods: one implementing a fixed shift from singular to plural, and one creating plural vectors from singular vectors using a linear mapping (FRACSS). Compared to FRACSS, CosClassAvg predicted plural vectors that were more similar to the corpus-extracted plural vectors in terms of vector length, but somewhat less similar in terms of orientation. Both FRACSS and CosClassAvg outperform the method using a fixed shift vector to create plural vectors, which does not do justice to the intricacies of English plural semantics. A computational modeling study revealed that the observed differences between the plural semantics generated by these three methods carry over to how well a computational model of the listener can understand previously unencountered plural forms. Among all methods, CosClassAvg provides a good balance in the trade-off between productivity (being able to understand novel plural forms) and faithfulness to corpus-extracted plural vectors (i.e., understanding the particulars of the meaning of a given plural form).
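The three ways of deriving a plural vector from a singular vector named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic random vectors, not corpus embeddings; the function names are hypothetical; the class-based function assumes that a CosClassAvg-style method adds the average singular-to-plural shift of a noun's semantic class; and the linear mapping is an ordinary least-squares fit standing in for FRACSS. The abstract does not give these details, so this should not be read as the authors' implementation, and the toy numbers say nothing about the paper's results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic toy data (NOT from the paper): 200 nouns, 20 dimensions, 4 classes.
n, d, k = 200, 20, 4
classes = rng.integers(0, k, size=n)           # hypothetical semantic class labels
class_shift = rng.normal(size=(k, d))          # one made-up shift vector per class
S = rng.normal(size=(n, d))                    # "singular" vectors
P = S + class_shift[classes] + 0.1 * rng.normal(size=(n, d))  # "plural" vectors


def fixed_shift(S_train, P_train, S_new):
    """Add one global average shift vector to every singular vector."""
    return S_new + (P_train - S_train).mean(axis=0)


def class_average_shift(S_train, P_train, c_train, S_new, c_new):
    """Add the average shift of the noun's own class (assumed CosClassAvg-style)."""
    shifts = P_train - S_train
    fallback = shifts.mean(axis=0)             # used if a class is unseen in training
    out = np.empty_like(S_new)
    for i, c in enumerate(c_new):
        mask = c_train == c
        out[i] = S_new[i] + (shifts[mask].mean(axis=0) if mask.any() else fallback)
    return out


def linear_mapping(S_train, P_train, S_new):
    """Fit a matrix M with S_train @ M ~ P_train by least squares (FRACSS stand-in)."""
    M, *_ = np.linalg.lstsq(S_train, P_train, rcond=None)
    return S_new @ M


def mean_cosine(A, B):
    """Average row-wise cosine similarity (orientation)."""
    num = (A * B).sum(axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1)
    return float((num / den).mean())


def mean_length_gap(A, B):
    """Average absolute difference in vector length."""
    return float(np.abs(np.linalg.norm(A, axis=1) - np.linalg.norm(B, axis=1)).mean())


# Hold out the last 50 nouns as "previously unencountered" plurals.
S_train, P_train, c_train = S[:150], P[:150], classes[:150]
S_test, P_test, c_test = S[150:], P[150:], classes[150:]

pred_fixed = fixed_shift(S_train, P_train, S_test)
pred_class = class_average_shift(S_train, P_train, c_train, S_test, c_test)
pred_linear = linear_mapping(S_train, P_train, S_test)

cos_fixed = mean_cosine(pred_fixed, P_test)
cos_class = mean_cosine(pred_class, P_test)
cos_linear = mean_cosine(pred_linear, P_test)

for name, pred in [("fixed shift", pred_fixed),
                   ("class average", pred_class),
                   ("linear mapping", pred_linear)]:
    print(name, mean_cosine(pred, P_test), mean_length_gap(pred, P_test))
```

The two helper metrics mirror the two criteria the abstract mentions: similarity in orientation (cosine) and similarity in vector length. Because the toy plurals are generated from class-specific shifts, the class-based prediction is favored by construction here.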

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s11525-024-09428-9.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a63/11554741/124cd991d698/11525_2024_9428_Fig1_HTML.jpg
