
Methodologically rigorous risk of bias tools for nonrandomized studies had low reliability and high evaluator burden.

Affiliations

George & Fay Yee Center for Healthcare Innovation, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba R3E 0T6, Canada; Department of Community Health Sciences, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba R3E 0T6, Canada.

Publication information

J Clin Epidemiol. 2020 Dec;128:140-147. doi: 10.1016/j.jclinepi.2020.09.033. Epub 2020 Sep 25.

Abstract

OBJECTIVE

To assess the real-world interrater reliability (IRR), interconsensus reliability (ICR), and evaluator burden of the Risk of Bias (RoB) in Nonrandomized Studies (NRS) of Interventions (ROBINS-I), and the ROB Instrument for NRS of Exposures (ROB-NRSE) tools.

STUDY DESIGN AND SETTING

A six-center cross-sectional study with seven reviewers (2 reviewer pairs) assessing the RoB using ROBINS-I (n = 44 NRS) or ROB-NRSE (n = 44 NRS). We used Gwet's AC statistic to calculate the IRR and ICR. To measure the evaluator burden, we assessed the total time taken to apply the tool and reach a consensus.
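The agreement coefficient used here, Gwet's AC (first-order AC1), corrects observed agreement for chance agreement based on average category prevalence, which makes it more stable than Cohen's kappa when rating distributions are skewed. A minimal sketch for two raters and categorical ratings follows; the function name and the toy ratings are illustrative, and the study's own analysis (e.g., handling of multiple rater pairs or weighted categories) may differ.

```python
def gwet_ac1(ratings_a, ratings_b):
    """Gwet's first-order agreement coefficient (AC1) for two raters.

    ratings_a, ratings_b: equal-length lists of categorical ratings,
    one entry per subject (e.g., per study assessed for risk of bias).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty lists of equal length")
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("AC1 needs at least two rating categories")

    # Observed agreement: fraction of subjects rated identically by both raters.
    pa = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: based on each category's average prevalence
    # across the two raters, pe = sum_q pi_q * (1 - pi_q) / (Q - 1).
    pe = 0.0
    for c in categories:
        pi = (ratings_a.count(c) + ratings_b.count(c)) / (2 * n)
        pe += pi * (1 - pi)
    pe /= (q - 1)

    return (pa - pe) / (1 - pe)
```

For example, two raters agreeing on 3 of 4 risk-of-bias judgments ("low" vs. "high") yield an AC1 of about 0.53, illustrating how chance correction pulls the coefficient below the raw 75% agreement.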

RESULTS

For ROBINS-I, both IRR and ICR for individual domains ranged from poor to substantial agreement. IRR and ICR on overall RoB were poor. The evaluator burden was 48.45 min (95% CI 45.61 to 51.29). For ROB-NRSE, the IRR and ICR for the majority of domains were poor, while the rest ranged from fair to perfect agreement. IRR and ICR on overall RoB were slight and poor, respectively. The evaluator burden was 36.98 min (95% CI 34.80 to 39.16).

CONCLUSIONS

We found both tools to have low reliability, although reliability was slightly higher for ROBINS-I. Measures to increase agreement between raters (e.g., detailed training, supportive guidance material) may improve reliability and decrease evaluator burden.

