Using an artificial intelligence tool can be as accurate as human assessors in level one screening for a systematic review.

Author information

Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada.

Department of Anesthesiology and Pain Medicine, The Ottawa Hospital, Ottawa, ON, Canada.

Publication information

Health Info Libr J. 2024 Jun;41(2):136-148. doi: 10.1111/hir.12413. Epub 2021 Nov 18.

DOI: 10.1111/hir.12413

PMID: 34792285

Abstract

BACKGROUND

Artificial intelligence (AI) offers a promising solution to expedite various phases of the systematic review process such as screening.

OBJECTIVE

We aimed to assess the accuracy of an AI tool in identifying eligible references for a systematic review compared to identification by human assessors.

METHODS

For the case study (a systematic review of knowledge translation interventions), we used a diagnostic accuracy design and independently assessed for eligibility a set of articles (n = 300) using human raters and the AI system DistillerAI (Evidence Partners, Ottawa, Canada). We analysed a series of 64 possible confidence levels for the AI's decisions and calculated several standard parameters of diagnostic accuracy for each.
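The abstract does not list which accuracy parameters were calculated. As an illustration only, the sketch below computes four common ones (sensitivity, specificity, positive and negative predictive value), treating the human decision as the reference standard. The function and class names are hypothetical and are not part of DistillerAI or the study's analysis code.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float  # positive predictive value
    npv: float  # negative predictive value


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a cell combination is empty.
    return numerator / denominator if denominator else float("nan")


def diagnostic_accuracy(human, ai) -> Accuracy:
    """Compare AI include/exclude decisions against human decisions.

    human, ai: equal-length sequences of bool, True meaning "include".
    The human decision is assumed to be the reference standard.
    """
    if len(human) != len(ai):
        raise ValueError("human and ai must have the same length")
    pairs = list(zip(human, ai))
    tp = sum(1 for h, a in pairs if h and a)          # both include
    fn = sum(1 for h, a in pairs if h and not a)      # AI missed an eligible article
    fp = sum(1 for h, a in pairs if not h and a)      # AI included an ineligible article
    tn = sum(1 for h, a in pairs if not h and not a)  # both exclude
    return Accuracy(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )
```

For example, `diagnostic_accuracy([True, True, False], [True, False, False])` gives a sensitivity of 0.5 and a specificity of 1.0, since the AI missed one of the two eligible articles and included no ineligible ones.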

RESULTS

When set to a lower AI confidence threshold of 0.1 or greater and an upper threshold of 0.9 or lower, DistillerAI made article selection decisions very similarly to human assessors. Within this range, DistillerAI made a decision on the majority of articles (93-100%), with a sensitivity of 1.0 and specificity ranging from 0.9 to 1.0.
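One way to read the two thresholds is as a three-way decision rule. The sketch below assumes that articles scoring at or below the lower threshold are excluded, those at or above the upper threshold are included, and anything in between is left undecided. This rule, the function names, and the sample scores are assumptions for illustration, not DistillerAI's documented behaviour or data from the study.

```python
def apply_thresholds(scores, lower, upper):
    """Map AI confidence scores to True (include), False (exclude) or None (undecided).

    Assumed rule: score <= lower -> exclude; score >= upper -> include.
    """
    decisions = []
    for score in scores:
        if score <= lower:
            decisions.append(False)
        elif score >= upper:
            decisions.append(True)
        else:
            decisions.append(None)
    return decisions


def evaluate_thresholds(scores, human, lower, upper):
    """Report the share of articles decided and the accuracy among decided articles."""
    decisions = apply_thresholds(scores, lower, upper)
    decided = [(h, d) for h, d in zip(human, decisions) if d is not None]
    tp = sum(1 for h, d in decided if h and d)
    fn = sum(1 for h, d in decided if h and not d)
    tn = sum(1 for h, d in decided if not h and not d)
    fp = sum(1 for h, d in decided if not h and d)
    return {
        "proportion_decided": len(decided) / len(scores) if scores else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up scores and human decisions, for illustration only.
scores = [0.05, 0.95, 0.50, 0.92, 0.02]
human = [False, True, True, True, False]
result = evaluate_thresholds(scores, human, lower=0.1, upper=0.9)
```

Under this reading, pulling the thresholds closer together raises the proportion of articles the AI decides on, at the risk of more disagreements with the human assessors. Running the function over a grid of threshold pairs shows that trade-off directly.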

CONCLUSION

In a case study of 300 articles, DistillerAI appeared to assess eligibility accurately. Further experimentation is needed to establish its performance in other subject areas.
