Using an artificial intelligence tool can be as accurate as human assessors in level one screening for a systematic review.

Affiliations

Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada.

Department of Anesthesiology and Pain Medicine, The Ottawa Hospital, Ottawa, ON, Canada.

Publication information

Health Info Libr J. 2024 Jun;41(2):136-148. doi: 10.1111/hir.12413. Epub 2021 Nov 18.

Abstract

BACKGROUND

Artificial intelligence (AI) offers a promising solution to expedite various phases of the systematic review process such as screening.

OBJECTIVE

We aimed to assess the accuracy of an AI tool in identifying eligible references for a systematic review compared to identification by human assessors.

METHODS

For the case study (a systematic review of knowledge translation interventions), we used a diagnostic accuracy design and independently assessed for eligibility a set of articles (n = 300) using human raters and the AI system DistillerAI (Evidence Partners, Ottawa, Canada). We analysed a series of 64 possible confidence levels for the AI's decisions and calculated several standard parameters of diagnostic accuracy for each.
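
As an illustration of the diagnostic accuracy parameters mentioned above, the following minimal sketch computes sensitivity and specificity, treating human decisions as the reference standard and AI decisions as the index test. The function name, the data layout, and the six example labels are assumptions made for illustration; they are not taken from the study.

```python
from typing import Sequence


def diagnostic_accuracy(reference: Sequence[bool], index: Sequence[bool]) -> dict:
    """Compare index-test decisions (e.g. the AI) with a reference standard
    (e.g. human raters). True means "include the article", False "exclude".
    """
    if len(reference) != len(index):
        raise ValueError("reference and index must have the same length")

    pairs = list(zip(reference, index))
    tp = sum(r and i for r, i in pairs)              # both include
    tn = sum((not r) and (not i) for r, i in pairs)  # both exclude
    fp = sum((not r) and i for r, i in pairs)        # AI includes, humans exclude
    fn = sum(r and (not i) for r, i in pairs)        # AI excludes, humans include

    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        # Sensitivity: share of human-included articles the AI also included.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of human-excluded articles the AI also excluded.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up labels for six articles, for illustration only.
human = [True, True, False, False, False, False]
ai = [True, True, False, False, True, False]

result = diagnostic_accuracy(human, ai)
print(result["sensitivity"], result["specificity"])
```

In a screening context, sensitivity is usually the more critical of the two, because a false negative is an eligible article that never reaches full-text review.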

RESULTS

When set to a lower AI confidence threshold of 0.1 or greater and an upper threshold of 0.9 or lower, DistillerAI made article selection decisions very similarly to human assessors. Within this range, DistillerAI made a decision on the majority of articles (93-100%), with a sensitivity of 1.0 and specificity ranging from 0.9 to 1.0.
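
The abstract does not spell out how the two thresholds turn a confidence score into a decision. One plausible reading, sketched below purely as an assumption, is that scores at or below the lower threshold are excluded, scores at or above the upper threshold are included, and scores in between are left undecided for a human to resolve. The function names and example scores are hypothetical and do not reflect DistillerAI's actual interface.

```python
from typing import Optional, Sequence


def decide(score: float, lower: float, upper: float) -> Optional[bool]:
    """Return False (exclude), True (include), or None (undecided).

    Assumed rule, not confirmed by the source: exclude at or below `lower`,
    include at or above `upper`, otherwise leave the article undecided.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if score <= lower:
        return False
    if score >= upper:
        return True
    return None


def proportion_decided(scores: Sequence[float], lower: float, upper: float) -> float:
    """Fraction of articles on which the tool commits to a decision."""
    if not scores:
        raise ValueError("scores must not be empty")
    decisions = [decide(s, lower, upper) for s in scores]
    return sum(d is not None for d in decisions) / len(decisions)


# Hypothetical confidence scores for six articles.
scores = [0.02, 0.05, 0.08, 0.30, 0.95, 0.97]

# With thresholds 0.1 and 0.9, only the article scored 0.30 stays undecided.
print(proportion_decided(scores, lower=0.1, upper=0.9))
```

Under this reading, moving the two thresholds closer together increases the share of articles decided automatically, and setting them equal forces a decision on every article.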

CONCLUSION

DistillerAI appears to be accurate in its assessment of articles in a case study of 300 articles. Further experimentation with DistillerAI is needed to establish its performance in other subject areas.
