School of Medical Sciences, Faculty of Medicine, University of New South Wales, Sydney, Australia.
Duke University Division of Physical Therapy, Duke Clinical Research Institute, North Carolina, USA.
Braz J Phys Ther. 2020 Mar-Apr;24(2):177-184. doi: 10.1016/j.bjpt.2019.01.009. Epub 2019 Jan 30.
The primary aim was to determine the reliability, internal consistency, measurement error, convergent validity, and floor and ceiling effects of three tools commonly used to assess the quality of diagnostic test accuracy studies in physical therapy. A secondary aim was to describe the quality of a sample of diagnostic accuracy studies.
Fifty studies were randomly selected from a comprehensive database of diagnostic accuracy studies relevant to physical therapy. Two reviewers independently rated each study with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS), Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) and Diagnostic Accuracy Quality Scale (DAQS) tools, applied in random order.
Only 7% of QUADAS items, 14% of QUADAS-2 items, and 33% of DAQS items had at least moderate inter-rater reliability (kappa>0.40). Internal consistency and convergent validity measures were acceptable (>0.70) in 33% and 50% of cases respectively. Floor or ceiling effects were not present in any tool. The quality of studies was mixed: most avoided case-control sampling strategies and used the same reference standard on all subjects, but many failed to enroll a consecutive or random sample of subjects or provide confidence intervals about estimates of diagnostic accuracy.
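To illustrate the item-level reliability threshold reported above, the following is a minimal sketch of how agreement between two reviewers on a single checklist item could be summarized. It assumes unweighted Cohen's kappa; the abstract does not state which kappa variant was used. The function name `cohen_kappa` and the ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same studies.

    Assumption: this is an illustrative statistic only; the abstract does
    not specify the exact kappa variant the authors computed.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of studies")

    n = len(ratings_a)

    # Observed agreement: proportion of studies given the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")

    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of one item across four studies (1 = criterion met).
reviewer_1 = [1, 1, 0, 0]
reviewer_2 = [1, 0, 0, 0]

kappa = cohen_kappa(reviewer_1, reviewer_2)
# Threshold from the abstract: kappa > 0.40 counts as at least moderate.
print(kappa, kappa > 0.40)
```

In this toy case the reviewers agree on three of four studies, chance agreement is 0.5, and the resulting kappa clears the 0.40 threshold. Repeating the calculation per item and counting how many items exceed the threshold gives percentages of the kind reported above.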
The QUADAS, QUADAS-2 and DAQS tools provide unreliable estimates of the quality of studies of diagnostic accuracy in physical therapy.