Joosse Aron Y, Kuscu Gökçe, Cassani Giovanni
Department of Cognitive Science and Artificial Intelligence, Tilburg School of Humanities and Digital Sciences, Tilburg University.
J Exp Psychol Learn Mem Cogn. 2025 Mar;51(3):478-495. doi: 10.1037/xlm0001345. Epub 2024 Sep 19.
We detail a successful attempt at modeling associations about the age, gender, and polarity of fictional characters based on their names alone. We started by collecting ratings through an online survey on a sample of annotated names from young-adult, children's, and fan-fiction stories. We collected ratings on three semantic differentials (gender: male-female; age: old-young; polarity: evil-good) using a slider bar. First, we show that participants tend to agree with authors: names judged to better suit female/young/evil characters tend to be assigned to female/young/evil characters in the original stories. We then show, in a series of computational studies, that we can predict participants' ratings on the three attributes using a distributional semantic model which derives representations for both lexical and sublexical patterns. This attempt was successful for all names, including made-up ones, using both a supervised and an unsupervised approach. The prediction supported by distributed representations is much better than that afforded by symbolic features such as letters and phonological features, even when accounting for the complexity of the feature spaces. Our results show that people interpret both known and novel names by relying on lexical and sublexical patterns, which suggests the availability of systematic form-meaning mappings in everyday language use. This further lends credence to the hypothesis that language-internal statistics can support systematic form-meaning associations which apply to both known and novel lexical items. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
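The unsupervised idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model. It assumes that a name is represented as the average of its character n-gram vectors (as in subword-aware distributional models such as fastText) and that a semantic differential is scored by projecting the name onto the axis between two anchor words. The function names, the anchor words, and the hash-based vectors are all hypothetical; the hash vectors carry no semantic content and only stand in for trained embeddings.

```python
import hashlib
import math


def char_ngrams(name, n=3):
    """Split a name into character n-grams, padded with boundary markers."""
    padded = f"<{name.lower()}>"
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_vector(ngram, dim=16):
    """Placeholder for a trained subword embedding (assumption):
    a deterministic pseudo-random vector derived from a hash."""
    digest = hashlib.sha256(ngram.encode("utf-8")).digest()
    return [(byte / 255.0) * 2.0 - 1.0 for byte in digest[:dim]]


def embed(word, n=3, dim=16):
    """Represent a word, known or made-up, as the mean of its n-gram vectors."""
    vectors = [ngram_vector(gram, dim) for gram in char_ngrams(word, n)]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def differential_score(name, pole_a, pole_b):
    """Score a name on the axis running from pole_a to pole_b.
    Positive values lean toward pole_b, negative values toward pole_a."""
    axis = [b - a for a, b in zip(embed(pole_a), embed(pole_b))]
    return cosine(embed(name), axis)


# Hypothetical usage: anchor words chosen for illustration only.
score = differential_score("Morgath", "good", "evil")
```

Because the representation is built from sublexical units, the same function returns a score for invented names that never occur in a corpus. With trained embeddings in place of `ngram_vector`, such scores could then be compared against the slider ratings, for example through a correlation.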