Valente Ana Rita S, Jesus Luis M T, Hall Andreia, Leahy Margaret
Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal; Department of Education (DE), University of Aveiro, Aveiro, Portugal.
Int J Lang Commun Disord. 2015 Jan-Feb;50(1):14-30. doi: 10.1111/1460-6984.12113. Epub 2014 Jun 11.
BACKGROUND: Event- and interval-based measurements are two different ways of computing the frequency of stuttering. Interval-based methodology emerged as an alternative to overcome reproducibility problems associated with event-based methodology. No review has examined the effect of methodological factors on interval-based absolute reliability data, or computed the agreement between the two methodologies in terms of inter-judge agreement, intra-judge agreement and accuracy (i.e., correspondence between raters' scores and an established criterion).
AIMS: To review the reproducibility of event-based and time-interval measurement, and to verify the effect of methodological factors (training, experience, interval duration, sample presentation order and judgment conditions) on the agreement of time-interval measurement; in addition, to determine whether it is possible to quantify the agreement between the two methodologies.
METHODS & PROCEDURES: The first two authors searched for articles on ERIC, MEDLINE, PubMed, B-on, CENTRAL and Dissertation Abstracts during January-February 2013 and retrieved 495 articles. Forty-eight articles were selected for review. Content tables were constructed with the main findings.
OUTCOMES & RESULTS: Articles on event-based measurement reported inter- and intra-judge reliability values greater than 0.70 and agreement percentages above 80%. Articles on time-interval measurement showed that, in general, judges with more experience of stuttering reached significantly higher levels of intra- and inter-judge agreement. For both methodologies, inter- and intra-judge values exceeded the reference thresholds for high reproducibility. Accuracy (the closeness of raters' judgements to an established criterion) and intra- and inter-judge agreement were higher for trained groups than for untrained groups. Sample presentation order and audio/video conditions did not produce differences in inter- or intra-judge results. An interval duration of 5 s appears to yield acceptable agreement. Explanations for the high reproducibility values, and the choice of parameters used to report them, are discussed.
CONCLUSIONS & IMPLICATIONS: Studies of both interval- and event-based methodologies used trained or experienced judges to determine inter- and intra-judge agreement, and the reported data exceeded the reference thresholds for good reproducibility. Inter- and intra-judge values were reported on different metric scales across the event-based and interval-based studies, which made it unfeasible to quantify the agreement between the two methods.
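The point about different metric scales can be illustrated with a minimal sketch. It assumes two judges have labelled each 5 s interval of a sample as stuttered (1) or fluent (0); the ratings, the function names and the choice of Cohen's kappa as the chance-corrected coefficient are all hypothetical and are not taken from the reviewed studies.

```python
from typing import Sequence


def percent_agreement(judge_a: Sequence[int], judge_b: Sequence[int]) -> float:
    """Percentage of intervals that both judges labelled identically."""
    if len(judge_a) != len(judge_b) or not judge_a:
        raise ValueError("both judges must rate the same non-empty set of intervals")
    matches = sum(a == b for a, b in zip(judge_a, judge_b))
    return 100.0 * matches / len(judge_a)


def cohens_kappa(judge_a: Sequence[int], judge_b: Sequence[int]) -> float:
    """Chance-corrected agreement for binary (0/1) interval labels."""
    n = len(judge_a)
    observed = percent_agreement(judge_a, judge_b) / 100.0
    # Proportion of intervals each judge labelled as stuttered.
    p_a = sum(judge_a) / n
    p_b = sum(judge_b) / n
    # Agreement expected by chance if the two judges rated independently.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both judges used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings for a 50 s sample split into ten 5 s intervals.
judge_a = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
judge_b = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]

print(percent_agreement(judge_a, judge_b))
print(round(cohens_kappa(judge_a, judge_b), 2))
```

For these made-up ratings the same pair of judges scores 80% agreement but a kappa of about 0.58, so a value reported as a percentage cannot be compared directly with one reported as a coefficient.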