Barnhart Huiman X, Yow Eric, Crowley Anna Lisa, Daubert Melissa A, Rabineau Dawn, Bigelow Robert, Pencina Michael, Douglas Pamela S
Duke Clinical Research Institute, Duke University Medical Center, Durham, USA
Stat Methods Med Res. 2016 Dec;25(6):2939-2958. doi: 10.1177/0962280214534651. Epub 2014 May 14.
Clinical core laboratories, such as echocardiography core laboratories, are increasingly used in clinical studies with imaging outcomes as primary, secondary, or surrogate endpoints. While many factors contribute to the quality of measurements of imaging variables, an essential step in ensuring the value of imaging data is the formal assessment and control of reproducibility via intra-observer and inter-observer reliability. Many different agreement/reliability indices appear in the literature. However, different indices may lead to different conclusions, and it is not clear which index is preferable both as an overall indication of data quality and as a tool for guiding improvements in quality and reliability in a core lab setting. In this paper, we pre-specify the desirable characteristics of an agreement index for assessing and improving reproducibility in a core lab setting, and we compare existing agreement indices against these characteristics to choose a preferred index. We conclude that, among the existing indices reviewed, the coverage probability is the preferred agreement index on the basis of its computational simplicity, its ability to rapidly identify discordant measurements and so guide review and retraining, and its consistent evaluation of data quality across multiple reviewers, populations, and continuous/categorical data.
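To make the preferred index concrete, the following is a minimal Python sketch. It assumes the common definition of coverage probability as the proportion of paired readings whose absolute difference falls within a pre-specified tolerance; the abstract itself does not spell out the formula. The function name `coverage_probability`, the tolerance of 5 units, and the readings are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def coverage_probability(
    first: Sequence[float], second: Sequence[float], tolerance: float
) -> Tuple[float, List[int]]:
    """Return (coverage probability, indices of discordant pairs).

    Assumed definition: the share of paired readings whose absolute
    difference is no larger than `tolerance`.
    """
    if len(first) != len(second):
        raise ValueError("paired readings must have the same length")
    n = len(first)
    if n == 0:
        raise ValueError("at least one pair of readings is required")
    # Pairs outside the tolerance are the ones to flag for review.
    discordant = [
        i for i, (a, b) in enumerate(zip(first, second))
        if abs(a - b) > tolerance
    ]
    return (n - len(discordant)) / n, discordant


# Made-up readings from two hypothetical readers (illustrative only).
reader_1 = [55.0, 60.0, 42.0, 35.0, 65.0]
reader_2 = [57.0, 59.0, 50.0, 36.0, 61.0]

cp, flagged = coverage_probability(reader_1, reader_2, tolerance=5.0)
print(f"coverage probability = {cp:.2f}; pairs to review: {flagged}")
```

The same call applies whether the two lists come from one reader measuring twice (intra-observer) or from two readers (inter-observer). The returned indices show how such an index could point directly at the discordant measurements that warrant review or retraining.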