Reducing systematic review burden using Deduklick: a novel, automated, reliable, and explainable deduplication algorithm to foster medical research.

Affiliations

Risklick AG, Spin-Off, University of Bern, Bern, Switzerland.

CTU Bern, University of Bern, Bern, Switzerland.

Publication information

Syst Rev. 2022 Aug 17;11(1):172. doi: 10.1186/s13643-022-02045-9.

Abstract

BACKGROUND

Identifying and removing duplicate references when conducting systematic reviews (SRs) remains a major, time-consuming task for authors, who manually check for duplicates using built-in features in citation managers. To address the problems of manual deduplication, we developed an automated, efficient, and rapid artificial intelligence-based algorithm named Deduklick. Deduklick combines natural language processing algorithms with a set of rules created by expert information specialists.

METHODS

Deduklick deduplicates references with a multistep algorithm: it normalizes the data, calculates a similarity score, and identifies unique and duplicate references based on metadata fields such as title, authors, journal, DOI, year, issue, volume, and page number range. We compared Deduklick's ability to detect duplicates accurately against the information specialists' standard manual duplicate removal process in EndNote on eight existing heterogeneous datasets. Using a sensitivity analysis, we manually cross-compared the efficiency and noise of both methods.
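The abstract does not spell out Deduklick's normalization rules, similarity measure, or decision threshold. The following is therefore only a minimal sketch of the general normalize, score, and classify pattern described above, not Deduklick's actual implementation. The field weights, the 0.9 threshold, the use of Python's `difflib.SequenceMatcher`, and all function names are assumptions made for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical field weights and threshold, chosen for illustration only.
FIELD_WEIGHTS = {"title": 0.5, "authors": 0.2, "journal": 0.1, "year": 0.1, "pages": 0.1}
DUPLICATE_THRESHOLD = 0.9


def normalize(value):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", str(value or ""))
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def normalize_doi(value):
    """Reduce a DOI to a bare lowercase identifier without a resolver prefix."""
    doi = str(value or "").strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)


def similarity(ref_a, ref_b):
    """Weighted mean of per-field string similarity, between 0.0 and 1.0."""
    doi_a, doi_b = normalize_doi(ref_a.get("doi")), normalize_doi(ref_b.get("doi"))
    if doi_a and doi_a == doi_b:
        return 1.0  # assumption: identical DOIs count as a certain match
    score = 0.0
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        a, b = normalize(ref_a.get(field)), normalize(ref_b.get(field))
        if not a or not b:
            continue  # skip fields missing from either record
        score += weight * SequenceMatcher(None, a, b).ratio()
        total += weight
    return score / total if total else 0.0


def deduplicate(references):
    """Return (unique, duplicates), keeping the first record of each match."""
    unique, duplicates = [], []
    for ref in references:
        if any(similarity(ref, kept) >= DUPLICATE_THRESHOLD for kept in unique):
            duplicates.append(ref)
        else:
            unique.append(ref)
    return unique, duplicates
```

Comparing every record against every kept record is quadratic in the number of references. For large search exports, a practical version would first group candidates by a cheap key, such as year or a title prefix.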

RESULTS

Deduklick achieved an average recall of 99.51%, an average precision of 100.00%, and an average F1 score of 99.75%. In contrast, the manual deduplication process achieved an average recall of 88.65%, an average precision of 99.95%, and an average F1 score of 91.98%. Deduklick thus matched or exceeded expert-level performance on duplicate removal. It also preserved high metadata quality and drastically reduced the time spent on analysis.

CONCLUSIONS

Deduklick is an efficient, transparent, ergonomic, and time-saving solution for identifying and removing duplicates in SR searches. It could therefore simplify SR production and offer scientists important advantages, including saving time, increasing accuracy, reducing costs, and contributing to high-quality SRs.
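For readers less familiar with the reported metrics, the sketch below uses the standard definitions of precision, recall, and F1 in terms of true positives (duplicates correctly flagged), false positives (unique references wrongly flagged), and false negatives (duplicates missed). The function names are illustrative. The abstract reports no per-dataset counts, so the only inputs used here are the averaged percentages quoted above.

```python
def precision(tp, fp):
    """Share of flagged references that really are duplicates."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of true duplicates that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# F1 recomputed from the averaged precision and recall quoted in the abstract.
deduklick_f1 = f1_score(1.0000, 0.9951)  # about 0.9975
manual_f1 = f1_score(0.9995, 0.8865)     # about 0.9396
```

The harmonic mean of Deduklick's averaged precision and recall reproduces the reported 99.75%. For the manual process it gives roughly 93.96% rather than the reported 91.98%. That gap is what one would expect if F1 was computed per dataset and then averaged, since the mean of per-dataset F1 scores generally differs from the F1 of the mean precision and recall. This reading is an inference from the figures, not something the abstract states.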

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9b42/9382798/59ce5be4dfd1/13643_2022_2045_Fig1_HTML.jpg
