Department of Cognitive Sciences, United Arab Emirates University, Al Ain, 15551, UAE.
Department of Psychology, University of Cambridge, Cambridge, UK.
Behav Res Methods. 2020 Oct;52(5):1893-1905. doi: 10.3758/s13428-020-01353-z.
Indicators of letter frequency and similarity have long been available for Indo-European languages. They have not only been pivotal in controlling the design of experimental psycholinguistic studies seeking to determine the factors that underlie reading ability and literacy acquisition, but have also been useful for studies examining the more general aspects of human cognition. Despite their importance, however, such indicators are still not available for Modern Standard Arabic (MSA), a language that, by virtue of its orthographic system, presents an invaluable environment for the experimental investigation of visual word processing. This paper presents for the first time the frequencies of Arabic letters and their allographs based on a 40-million-word corpus, along with their similarity/confusability indicators in three domains: (1) the visual domain, based on human ratings; (2) the auditory domain, based on an analysis of the phonetic features of letter sounds; and (3) the motoric domain, based on an analysis of the stroke features used to write letters and their allographs. Taken together, the frequency and similarity of Arabic letters and their allographs in the visual and motoric domains, as well as the similarities among the letter sounds, will be useful for researchers interested in the processes underpinning orthographic processing, visual word recognition, reading, and literacy acquisition.
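As a rough illustration of the kind of letter-frequency count the abstract describes, the following Python sketch tallies relative letter frequencies in a text sample. It is not the authors' procedure: the function name `letter_frequencies`, the `ARABIC_LETTERS` set, and the decision to approximate the letter inventory by two Unicode ranges (U+0621 to U+063A and U+0641 to U+064A) are all assumptions made here for illustration. The sketch counts base letters only; it ignores diacritics and the tatweel, and it does not distinguish positional allographs.

```python
from collections import Counter

# Assumption: approximate the Arabic letter inventory with two Unicode ranges,
# U+0621..U+063A and U+0641..U+064A. This skips the tatweel (U+0640) and the
# combining diacritics (U+064B onward), and treats hamza carriers,
# teh marbuta, and alef maksura as separate letters.
ARABIC_LETTERS = {chr(cp) for cp in range(0x0621, 0x063B)} | {
    chr(cp) for cp in range(0x0641, 0x064B)
}


def letter_frequencies(text):
    """Return the relative frequency of each Arabic letter found in text.

    Hypothetical helper: characters outside ARABIC_LETTERS (spaces,
    punctuation, diacritics, Latin text) are ignored.
    """
    counts = Counter(ch for ch in text if ch in ARABIC_LETTERS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


if __name__ == "__main__":
    # Toy sample: beh, fatha (diacritic), alef, beh.
    sample = "\u0628\u064E\u0627\u0628"
    for letter, freq in sorted(letter_frequencies(sample).items()):
        print(f"U+{ord(letter):04X}: {freq:.3f}")
```

Counting allographs separately would need positional information (isolated, initial, medial, final), which plain Unicode text does not encode directly; that step is left out of this sketch.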