Downs S H, Black N
Department of Public Health and Policy, London School of Hygiene and Tropical Medicine.
J Epidemiol Community Health. 1998 Jun;52(6):377-84. doi: 10.1136/jech.52.6.377.
To test the feasibility of creating a valid and reliable checklist that is appropriate for assessing both randomised and non-randomised studies, and that provides both an overall score for study quality and a profile of scores not only for the quality of reporting, internal validity (bias and confounding) and power, but also for external validity.
A pilot version was first developed, based on epidemiological principles, reviews, and existing checklists for randomised studies. Face and content validity were assessed by three experienced reviewers, and reliability was determined by two raters assessing 10 randomised and 10 non-randomised studies. The checklist was then revised and, using different raters, tested for internal consistency (Kuder-Richardson 20), test-retest and inter-rater reliability (Spearman correlation coefficient and sign rank test; kappa statistics), criterion validity, and respondent burden.
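For readers unfamiliar with the internal consistency statistic named above, the following is a minimal sketch of the Kuder-Richardson 20 formula for dichotomous (0/1) items. It is not the authors' code: the function name `kr20`, the demonstration data, and the choice of population variance for the total scores are assumptions made for illustration only.

```python
from statistics import pvariance


def kr20(scores):
    """Kuder-Richardson 20 for a table of dichotomous item scores.

    scores: list of rows, one row per assessed paper, each row a list
    of 0/1 item scores. Assumes population variance of total scores.
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two papers and two items")

    # Proportion of papers scoring 1 on each item.
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sum of item variances p * (1 - p).
    item_var = sum(pj * (1 - pj) for pj in p)
    # Variance of each paper's total score.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - item_var / total_var)


# Hypothetical data: five papers scored on four yes/no items.
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 2))  # 0.8
```

Values closer to 1 indicate that the items tend to agree with one another, which is how the KR-20 figures reported for the Quality Index and its subscales are read.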
The performance of the checklist improved considerably after revision of the pilot version. The Quality Index had high internal consistency (KR-20: 0.89), as did the subscales, apart from external validity (KR-20: 0.54). Test-retest (r = 0.88) and inter-rater (r = 0.75) reliability of the Quality Index were good. Reliability of the subscales varied from good (bias) to poor (external validity). The Quality Index correlated highly with an existing, established instrument for assessing randomised studies (r = 0.90). There was little difference between its performance with non-randomised and with randomised studies. Raters took about 20 minutes to assess each paper (range 10 to 45 minutes).
This study has shown that it is feasible to develop a checklist that can be used to assess the methodological quality not only of randomised controlled trials but also of non-randomised studies. It has also shown that it is possible to produce a checklist that provides a profile of the paper, alerting reviewers to its particular methodological strengths and weaknesses. Further work is required to improve the checklist and the training of raters in the assessment of external validity.