Automating risk of bias assessment in systematic reviews: a real-time mixed methods comparison of human researchers to a machine learning system.

Affiliations

Division for Health Services, Norwegian Institute of Public Health, Postboks 222 Skøyen, 0213, Oslo, Norway.

Facultad de Cultura Física, Deporte y Recreación, Cra. 9 #51-11, Bogotá, Colombia.

Publication information

BMC Med Res Methodol. 2022 Jun 8;22(1):167. doi: 10.1186/s12874-022-01649-y.

Abstract

BACKGROUND

Machine learning and automation are increasingly used to make the evidence synthesis process faster and more responsive to policymakers' needs. In systematic reviews of randomized controlled trials (RCTs), risk of bias assessment is a resource-intensive task that typically requires two trained reviewers. One function of RobotReviewer, an off-the-shelf machine learning system, is an automated risk of bias assessment.

METHODS

We assessed the feasibility of adopting RobotReviewer within a national public health institute using a randomized, real-time, user-centered study. The study included 26 RCTs and six reviewers from two projects examining health and social interventions. We randomized these studies to one of two RobotReviewer platforms. We operationalized feasibility as accuracy, time use, and reviewer acceptability. We measured accuracy by the number of corrections made by human reviewers, either to automated assessments or to another human reviewer's assessments. We explored acceptability through group discussions and individual email responses after presenting the quantitative results.
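The allocation step described above can be illustrated with a minimal sketch. The abstract does not state how the randomization was performed, so simple (unrestricted) randomization is an assumption here, and the study identifiers, arm labels, and function name are hypothetical placeholders rather than anything taken from the study.

```python
import random


def randomize_studies(study_ids, arms=("platform_A", "platform_B"), seed=None):
    """Assign each study to one arm by simple randomization.

    Assumption: unrestricted allocation with equal probability per arm.
    The arm labels are hypothetical; the abstract does not name the platforms.
    """
    rng = random.Random(seed)  # a seeded generator makes the allocation reproducible
    return {study_id: rng.choice(arms) for study_id in study_ids}


if __name__ == "__main__":
    # 26 placeholder identifiers, matching the number of RCTs in the study
    studies = [f"study_{i}" for i in range(1, 27)]
    allocation = randomize_studies(studies, seed=2022)
    for study_id, arm in allocation.items():
        print(study_id, arm)
```

Simple randomization does not guarantee equal group sizes; a blocked scheme would be needed if balanced arms were required.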

RESULTS

Reviewers were as likely to accept RobotReviewer's judgement as each other's judgement during the consensus process when acceptance was measured dichotomously: risk ratio 1.02 (95% CI 0.92 to 1.13; p = 0.33). We were not able to compare time use. Reviewer acceptance of the program was mixed. Less experienced reviewers were generally more positive: they saw more benefits and were able to use the tool more flexibly. Reviewers positioned human input and human-to-human interaction as superior to even semi-automation of this process.
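To make the risk ratio above concrete, the following sketch computes a risk ratio and a 95% confidence interval from a 2x2 table using the standard log-transformation (Katz) method. This is not necessarily the method the authors used: it treats every judgement as independent and ignores any clustering of judgements within studies or reviewers. The counts in the usage example are invented for illustration and are not data from the study.

```python
import math


def risk_ratio(accepted_a, total_a, accepted_b, total_b, z=1.96):
    """Risk ratio of group A versus group B with a log-method confidence interval.

    accepted_*: number of judgements accepted without correction
    total_*:    number of judgements assessed
    z:          normal quantile (1.96 gives an approximate 95% interval)
    """
    risk_a = accepted_a / total_a
    risk_b = accepted_b / total_b
    rr = risk_a / risk_b
    # Standard error of log(RR) for two independent binomial proportions
    se_log_rr = math.sqrt(
        1 / accepted_a - 1 / total_a + 1 / accepted_b - 1 / total_b
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study
    rr, lower, upper = risk_ratio(90, 100, 88, 100)
    print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that spans 1, as the reported 0.92 to 1.13 does, is consistent with no difference in acceptance between the two sources of judgement.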

CONCLUSION

Despite being presented with evidence of RobotReviewer's equal performance to humans, participating reviewers were not interested in modifying standard procedures to include automation. If further studies confirm equal accuracy and reduced time compared to manual practices, we suggest that the benefits of RobotReviewer may support its future implementation as one of two assessors, despite reviewer ambivalence. Future research should study barriers to adopting automated tools and how highly educated and experienced researchers can adapt to a job market that is increasingly challenged by new technologies.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74b0/9175313/621c9904dbce/12874_2022_1649_Fig1_HTML.jpg
