Department of Communication Sciences and Disorders, University of Central Florida, Room 106, Health and Public Affairs II, Orlando, FL, 32816-2215, USA.
Division of Speech and Hearing Sciences, University of Hong Kong, Pok Fu Lam, Hong Kong SAR.
Behav Res Methods. 2019 Jun;51(3):1131-1144. doi: 10.3758/s13428-018-1043-6.
This article reports the construction of the Cantonese AphasiaBank, a multimodal annotated database of spoken discourse and co-verbal gestures produced by healthy native speakers of Cantonese and by individuals with language impairment. The corpus was established as a foundation for aphasiologists and clinicians to design and conduct research into theoretical and clinical issues related to acquired language disorders in Chinese. The purpose, structure, and levels of annotation of the database (which contains part-of-speech-annotated orthographic transcripts with Romanization and the corresponding videos) are described. The discussion presents the challenges of building a spoken database of a language that is not linguistically well-researched and that lacks a standardized written form for many of its lexical items, and explains how these issues were addressed. Most importantly, the article highlights the potential of Cantonese AphasiaBank as a powerful research tool for linguists and psycholinguists.