Evidence-Based Medicine Center, School of Basic Medical Sciences, Lanzhou University, Lanzhou, Gansu 730000, China.
Institute of Global Health, University of Geneva, Geneva 1202, Switzerland.
Chin Med J (Engl). 2023 Jun 20;136(12):1430-1438. doi: 10.1097/CM9.0000000000002713. Epub 2023 May 16.
This study aimed to develop a comprehensive instrument for evaluating and ranking clinical practice guidelines, the Scientific, Transparent and Applicable Rankings (STAR) tool, and to test its reliability, validity, and usability.
We set up a multidisciplinary working group of guideline methodologists, statisticians, journal editors, clinicians, and other experts. We developed the STAR tool using a scoping review, the Delphi method, and the analytic hierarchy process. We then evaluated the instrument's intrinsic and interrater reliability, content and criterion validity, and usability.
STAR contained 39 items grouped into 11 domains. The mean intrinsic reliability of the domains, measured with Cronbach's α coefficient, was 0.588 (95% confidence interval [CI]: 0.414, 0.762). Interrater reliability, assessed with Cohen's kappa coefficient, was 0.774 (95% CI: 0.740, 0.807) for methodological evaluators and 0.618 (95% CI: 0.587, 0.648) for clinical evaluators. The overall content validity index was 0.905. Pearson's correlation coefficient (r) for criterion validity was 0.885 (95% CI: 0.804, 0.932). The mean usability score of the items was 4.6, and the median time to evaluate each guideline was 20 min.
The instrument performed well in terms of reliability, validity, and efficiency, and can be used for comprehensively evaluating and ranking guidelines.
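For readers unfamiliar with the reliability statistics reported above, the following is a minimal sketch of how Cronbach's α and an unweighted Cohen's kappa are conventionally computed. The abstract does not state the authors' exact procedure or software, so the function names (`cronbach_alpha`, `cohen_kappa`) and the toy scores are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one per rated guideline; each row holds the
    scores for the k items of one domain (hypothetical layout).
    """
    k = len(scores[0])
    # Sample variance of each item across guidelines
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed score per guideline
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    # Proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Toy data, invented for illustration only
toy_scores = [[1, 2], [2, 3], [3, 4]]
toy_rater_a = [1, 1, 0, 0]
toy_rater_b = [1, 1, 0, 1]

alpha = cronbach_alpha(toy_scores)
kappa = cohen_kappa(toy_rater_a, toy_rater_b)
```

On this toy input the two items move in lockstep, so α reaches its maximum of 1; the raters agree on three of four items against a chance agreement of 0.5, which gives a kappa of 0.5.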