Berk R A
Am J Ment Defic. 1979 Mar;83(5):460-72.
Sixteen indices of interobserver agreement and six methods for estimating coefficients of interobserver reliability were critiqued. The agreement statistics were found to be imprecise, limited psychometrically, and relatively inflexible in terms of the diverse categorical and quantitative data sets typically encountered in mental retardation research. Five of the reliability statistics produced precise estimates of agreement, yet possessed similar limitations. Only the intraclass correlation/generalizability theory approach seemed to offer the precision, comprehensiveness, and flexibility required to deal with the complexity of reliability assessment. A basic generalizability model was described and illustrated with group and single-subject research data.
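As an illustration of the intraclass correlation/generalizability approach named in the abstract, the following is a minimal sketch of a one-facet, fully crossed design (subjects by observers, one score per cell). It is not the model from the article, whose details the abstract does not give. The function names `g_study` and `g_coefficient`, the clipping of negative variance estimates to zero, and the example scores are all assumptions made for illustration.

```python
def g_study(scores):
    """Estimate variance components for a subjects x observers design.

    `scores` is a list of rows, one row per subject and one column per
    observer (hypothetical layout, one observation per cell).
    """
    n_p = len(scores)        # number of subjects
    n_o = len(scores[0])     # number of observers
    grand = sum(sum(row) for row in scores) / (n_p * n_o)

    subj_means = [sum(row) / n_o for row in scores]
    obs_means = [sum(row[j] for row in scores) / n_p for j in range(n_o)]

    # Sums of squares for a two-way ANOVA without replication.
    ss_p = n_o * sum((m - grand) ** 2 for m in subj_means)
    ss_o = n_p * sum((m - grand) ** 2 for m in obs_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_o

    ms_p = ss_p / (n_p - 1)
    ms_o = ss_o / (n_o - 1)
    ms_res = ss_res / ((n_p - 1) * (n_o - 1))

    # Expected-mean-square solutions; negative estimates are clipped to 0
    # (an assumption of this sketch, one common convention among several).
    return {
        "subject": max((ms_p - ms_res) / n_o, 0.0),
        "observer": max((ms_o - ms_res) / n_p, 0.0),
        "residual": max(ms_res, 0.0),
    }


def g_coefficient(components, n_observers):
    """Generalizability coefficient for relative decisions.

    Ratio of subject variance to subject variance plus the residual
    error averaged over `n_observers` observers.
    """
    var_p = components["subject"]
    rel_error = components["residual"] / n_observers
    total = var_p + rel_error
    return var_p / total if total > 0 else float("nan")


# Made-up scores: 4 subjects rated by 2 observers.
example = [[2, 4], [4, 3], [6, 7], [8, 6]]
parts = g_study(example)
print(parts)
print(g_coefficient(parts, 1), g_coefficient(parts, 2))
```

With a single observer the coefficient corresponds to a consistency-type intraclass correlation for one rater; raising `n_observers` shows how averaging over more observers would be expected to change the estimate, which is the kind of flexibility the abstract attributes to this approach.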