Chang Li-Yun, Tseng Chien-Chih, Perfetti Charles A, Chen Hsueh-Chih
Department of Chinese as a Second Language, National Taiwan Normal University, Taipei, Taiwan.
Department of Educational Psychology and Counseling, National Taiwan Normal University, Taipei, Taiwan.
Behav Res Methods. 2022 Apr;54(2):632-648. doi: 10.3758/s13428-021-01611-8. Epub 2021 Aug 2.
This study developed and validated a Chinese pseudo-character/non-character producing system (CPN system) that helps researchers create experimental materials from Chinese characters. Based on a large-scale dataset of 6097 characters, the CPN system provides precise Chinese orthographic information (structures and positions, radical frequency, number of strokes, number of radical-sharing neighbors, and position-based regularity) for creating three types of experimental stimuli: pseudo-characters, semi non-characters, and whole non-characters. Because it covers the position-based regularity of 446 radicals, the CPN system lets researchers manipulate, or control for, the orthographic characteristics of radicals when studying Chinese lexical processing. Stimuli created by the system were validated empirically in two studies, one with Chinese-as-second-language learners (n = 79) and one with first-language users (n = 41). In both, participants completed a Chinese orthographic choice task in which they compared two artificial characters and chose the one that more closely resembled a real Chinese character. Both validations show that highly proficient Chinese readers are better able to identify pseudo-characters, suggesting that a radical's position-based information affects Chinese character identification to different extents. In addition to this empirical support, the system auto-generates output as downloadable images and Excel sheets for creating customized stimuli, making material selection easy, efficient, and effective. The CPN system is the first large-scale, data-driven tool freely available to researchers who study written Chinese. It should benefit research on Chinese orthographic processing, Chinese instruction, and cross-linguistic comparisons.