Trott Sean, Bergen Benjamin, Wittenberg Eva
Department of Cognitive Science, UC San Diego, San Diego, United States.
Department of Linguistics, UC San Diego, San Diego, United States.
Lang Resour Eval. 2022 Nov 28:1-25. doi: 10.1007/s10579-022-09619-y.
Speakers enjoy considerable flexibility in how they refer to a given referent: referring expressions can vary in their form (e.g., "she" vs. "the cat"), their length (e.g., "the (big) (orange) cat"), and more. What factors drive a speaker's decisions about how to refer, and how do these decisions shape a comprehender's ability to resolve the intended referent? Answering either question presents a methodological challenge; researchers must strike a balance between experimental control and ecological validity. In this paper, we introduce the SCARFS (Spontaneous, Controlled Acts of Reference between Friends and Strangers) Database: a corpus of approximately 20,000 English nominal referring expressions (NREs), produced in the context of a communication game. For each NRE, the corpus lists the concept the speaker was trying to convey (from a set of 471 possible target concepts), formal properties of the NRE (e.g., its length), the relationship between the interlocutors (i.e., friend vs. stranger), and the communicative outcome (i.e., whether the expression was successfully resolved). Researchers from diverse disciplines may use this resource to answer questions about how speakers refer and how comprehenders resolve their intended referent, as well as other fundamental questions about dialogic speech, such as whether and how speakers tailor their utterances to the identity of their interlocutor, how second-degree associations are generated, and the predictors of communicative success.
The online version contains supplementary material available at 10.1007/s10579-022-09619-y.