Institute of Neuroscience and Physiology, Speech and Language Pathology Unit, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.
School of Medical Sciences, Division of Dentistry, The University of Manchester, Manchester, Greater Manchester, UK.
Clin Linguist Phon. 2021 Feb 1;35(2):138-153. doi: 10.1080/02699206.2020.1758793. Epub 2020 May 6.
The consequence of differing levels of agreement across raters is rarely studied. Consequently, little is known about how the number of raters affects the outcome. The present study examined how pre-linguistic outcome classifications of 12-month-old infants are affected when four raters are used instead of three. Thirty experienced Speech and Language Therapists (SLTs) from five countries assessed 20-minute video recordings of four 12-month-old infants during a play session with a parent. One recording was assessed twice. A naturalistic listening method in real time was used. This involved (1) assessing each syllable as canonical or non-canonical, and (2) following the recording, assessing whether the infant was babbling canonically and listing the syllables the infant produced with command. The impact of four raters compared to three was explored by classifying the outcome for every possible combination of three raters and determining how often the outcome classification changed when a fourth rater was added. Adding a fourth rater had a minimal impact on the canonical babbling ratio assessment. For presence/absence of canonical babbling and size of consonant inventory, the impact was negligible in three out of four recordings, whereas in one recording the size of syllable inventory and presence/absence of canonical babbling were minimally affected by adding a fourth rater. In conclusion, adding a fourth rater when assessing pre-linguistic utterances of 12-month-old infants with naturalistic assessment in real time does not considerably affect outcome classifications. Thus, using three raters rather than four is recommended.
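The combination procedure described in the abstract can be illustrated with a short Python sketch. It is only a rough illustration under stated assumptions: the abstract does not say how individual ratings were pooled or which cut-off defined an outcome class, so the mean-based pooling, the `CBR_THRESHOLD` value, the function names, and the example ratings are all hypothetical.

```python
from itertools import combinations
from statistics import mean

# Assumption: a recording counts as "canonical babbling present" when the
# mean canonical babbling ratio (CBR) across raters reaches this cut-off.
# The value is illustrative and is not taken from the abstract.
CBR_THRESHOLD = 0.15


def classify(ratios):
    """Return True if the pooled (mean) CBR meets the assumed cut-off."""
    return mean(ratios) >= CBR_THRESHOLD


def fourth_rater_change_rate(cbr_by_rater):
    """Share of (trio, fourth rater) pairings where the classification flips."""
    raters = sorted(cbr_by_rater)
    changes = 0
    total = 0
    # Enumerate every possible combination of three raters.
    for trio in combinations(raters, 3):
        base = classify([cbr_by_rater[r] for r in trio])
        # Add each remaining rater in turn as the fourth assessor.
        for extra in raters:
            if extra in trio:
                continue
            with_fourth = classify([cbr_by_rater[r] for r in trio + (extra,)])
            changes += base != with_fourth
            total += 1
    return changes / total if total else 0.0


# Made-up ratings from five hypothetical raters for one recording.
example = {"A": 0.10, "B": 0.14, "C": 0.18, "D": 0.22, "E": 0.12}
rate = fourth_rater_change_rate(example)
print(f"Classification changed in {rate:.0%} of pairings")
```

A low change rate for a recording would correspond to the "minimal" or "negligible" impact the abstract reports. The same loop could be applied to other outcomes, such as inventory size, by swapping in a different `classify` function.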