Wang Yixia, Wang Yanxue, Chen Qi, Keuleers Emmanuel
Tilburg University, Warandelaan 2, 5037 AB Tilburg, Netherlands.
South China Normal University, Guangzhou, China.
Behav Res Methods. 2025 Jun 23;57(7):206. doi: 10.3758/s13428-025-02701-7.
This paper presents the Simplified Chinese Lexicon Project (SCLP), which collected lexical decision data for all 8105 characters in the List of Commonly Used Standard Chinese Characters and for 4864 pseudocharacters generated using a novel method that leverages the hierarchical nature of Chinese characters. We compared the collected data with existing megastudies on Chinese characters and found that the new data performed similarly in terms of reliability. Because it covers simplified Chinese characters comprehensively, the present study extends existing studies by allowing a more fine-grained investigation of how a variety of character attributes affect visual processing. We illustrate these advantages with virtual experiments on visual complexity and on the interplay between neighborhood size and regularity. Our results indicated that characters with higher visual complexity were harder to recognize, in line with previous findings, while regular characters took longer to process when the neighborhood size was small. In addition, we present a new evaluation of the interaction between character frequency and subcomponent frequency, which revealed a three-way interaction among character frequency, radical frequency, and residual component frequency. Extending the investigation of subcomponent frequency to pseudocharacters, we found that the interaction of radical frequency and residual component frequency also modulated pseudocharacter rejection. To support researchers in conducting behavioral experiments or statistical modeling, we provide both trial-level data and experiment materials.