Rathbone John, Carter Matt, Hoffmann Tammy, Glasziou Paul
Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia.
Syst Rev. 2015 Jan 14;4(1):6. doi: 10.1186/2046-4053-4-6.
A major problem arising from searching across bibliographic databases is the retrieval of duplicate citations. Removing such duplicates is an essential task to ensure systematic reviewers do not waste time screening the same citation multiple times. Although reference management software uses algorithms to remove duplicate records, this is only partially successful, and the remaining duplicates must be removed manually. This time-consuming task leads to wasted resources. We sought to evaluate the effectiveness of a newly developed deduplication program against EndNote.
The 1,988 citations retrieved by a literature search were manually inspected, and duplicate citations were identified and coded to create a benchmark dataset. The Systematic Review Assistant-Deduplication Module (SRA-DM) was iteratively developed and tested using the benchmark dataset and compared with EndNote's default one-step auto-deduplication process matching on ('author', 'year', 'title'). The accuracy of deduplication was reported by calculating the sensitivity and specificity. Further validation tests, with three additional benchmarked literature searches comprising a total of 4,563 citations, were performed to determine the reliability of the SRA-DM algorithm.
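The evaluation described above can be illustrated with a minimal Python sketch. It is not the SRA-DM algorithm or EndNote's implementation, neither of which is specified here; the `normalise`, `flag_duplicates`, and `sensitivity_specificity` functions, the record fields, and the toy records are all hypothetical. The sketch flags a record as a duplicate when its ('author', 'year', 'title') key has been seen before, then scores the result against a manually coded benchmark, taking sensitivity as the proportion of true duplicates that were flagged and specificity as the proportion of unique records that were left unflagged.

```python
from typing import Dict, Iterable, Set, Tuple


def normalise(text: str) -> str:
    """Lower-case, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def flag_duplicates(records: Iterable[Dict]) -> Set[int]:
    """Flag records whose (author, year, title) key matches an earlier record.

    Assumption: a simple exact match on normalised fields, for illustration only.
    """
    seen: Set[Tuple[str, int, str]] = set()
    flagged: Set[int] = set()
    for rec in records:
        key = (normalise(rec["author"]), rec["year"], normalise(rec["title"]))
        if key in seen:
            flagged.add(rec["id"])
        else:
            seen.add(key)
    return flagged


def sensitivity_specificity(
    flagged: Set[int], true_duplicates: Set[int], all_ids: Set[int]
) -> Tuple[float, float]:
    """Score flagged records against a manually coded benchmark."""
    tp = len(flagged & true_duplicates)            # duplicates correctly flagged
    fn = len(true_duplicates - flagged)            # duplicates missed
    fp = len(flagged - true_duplicates)            # unique records wrongly flagged
    tn = len(all_ids - flagged - true_duplicates)  # unique records left alone
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy benchmark: records 2 and 3 are coded as duplicates of record 1.
records = [
    {"id": 1, "author": "Smith J", "year": 2010, "title": "Aspirin for headache"},
    {"id": 2, "author": "Smith J", "year": 2010, "title": "Aspirin for headache."},
    {"id": 3, "author": "Smith John", "year": 2010, "title": "Aspirin for headache"},
    {"id": 4, "author": "Jones A", "year": 2012, "title": "Statins and stroke"},
]
true_duplicates = {2, 3}
all_ids = {rec["id"] for rec in records}

flagged = flag_duplicates(records)
sens, spec = sensitivity_specificity(flagged, true_duplicates, all_ids)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data, record 3 differs from record 1 only in how the author name is written, so exact key matching misses it. This is the kind of residual duplicate that would otherwise have to be removed manually.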
The sensitivity (84%) and specificity (100%) of SRA-DM were superior to those of EndNote (sensitivity 51%, specificity 99.83%). Validation testing on three additional biomedical literature searches demonstrated that SRA-DM consistently achieved higher sensitivity than EndNote (90% vs 63%, 84% vs 73%, and 84% vs 64%). Furthermore, the specificity of SRA-DM was 100%, whereas the specificity of EndNote was imperfect (average 99.75%), with some unique records wrongly assigned as duplicates. Overall, there was a 42.86% increase in the number of duplicate records detected with SRA-DM compared with EndNote auto-deduplication.
The Systematic Review Assistant-Deduplication Module offers users a reliable program to remove duplicate records with greater sensitivity and specificity than EndNote. This application will save researchers and information specialists time and avoid research waste. The deduplication program is freely available online.