Déjà vu--a study of duplicate citations in Medline.

Authors

Errami Mounir, Hicks Justin M, Fisher Wayne, Trusty David, Wren Jonathan D, Long Tara C, Garner Harold R

Affiliation

UT Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas TX 75390-9185, USA.

Publication

Bioinformatics. 2008 Jan 15;24(2):243-9. doi: 10.1093/bioinformatics/btm574. Epub 2007 Dec 1.

Abstract

MOTIVATION

Duplicate publication impacts the quality of the scientific corpus, has been difficult to detect, and studies thus far have been limited in scope and size. Using text similarity searches, we were able to identify signatures of duplicate citations among a body of abstracts.
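As a rough illustration of what a text similarity search over abstracts can look like, the sketch below flags pairs of abstracts whose word sets overlap heavily. It uses plain Jaccard similarity, which is a stand-in chosen for brevity and is not the eTBLAST algorithm; the function names and the 0.8 threshold are hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric word tokens in an abstract."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the word sets of two abstracts (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def flag_candidates(abstracts, threshold=0.8):
    """Return (i, j, score) for every pair at or above the threshold.

    The threshold is an illustrative assumption, not a value from the study.
    Flagged pairs are only candidates and would still need manual review.
    """
    pairs = []
    for i in range(len(abstracts)):
        for j in range(i + 1, len(abstracts)):
            score = jaccard(abstracts[i], abstracts[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    sample = [
        "Duplicate publication is hard to detect in large corpora.",
        "Duplicate publication is hard to detect in large corpora!",
        "Protein folding kinetics measured by NMR.",
    ]
    print(flag_candidates(sample))
```

The all-pairs loop is quadratic in the number of abstracts, so it only suits small samples; a corpus-scale search would need an index or a dedicated search tool.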

RESULTS

A sample of 62,213 Medline citations was examined and a database of manually verified duplicate citations was created to study author publication behavior. We found that 0.04% of the citations with no shared authors were highly similar and are thus potential cases of plagiarism. 1.35% with shared authors were sufficiently similar to be considered duplicates. Extrapolating, this would correspond to 3,500 and 117,500 duplicate citations in total, respectively.
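The abstract does not state the corpus size behind the extrapolation, but it can be back-calculated from the quoted figures. The sketch below does that arithmetic, assuming both percentages apply to the same base of citations; the variable and function names are illustrative. The two implied bases come out at about 8.75 million and 8.70 million citations, and the small gap is what rounding in the published percentages would produce.

```python
# Figures quoted in the abstract.
RATE_NO_SHARED_AUTHORS = 0.04 / 100   # highly similar, no shared authors
RATE_SHARED_AUTHORS = 1.35 / 100      # duplicates with shared authors
TOTAL_NO_SHARED_AUTHORS = 3_500       # extrapolated count
TOTAL_SHARED_AUTHORS = 117_500        # extrapolated count


def implied_corpus_size(extrapolated_count, rate):
    """Corpus size that turns `rate` into `extrapolated_count`."""
    return extrapolated_count / rate


base_no_shared = implied_corpus_size(TOTAL_NO_SHARED_AUTHORS, RATE_NO_SHARED_AUTHORS)
base_shared = implied_corpus_size(TOTAL_SHARED_AUTHORS, RATE_SHARED_AUTHORS)

if __name__ == "__main__":
    print(f"Implied base (no shared authors): {base_no_shared:,.0f}")
    print(f"Implied base (shared authors):    {base_shared:,.0f}")
```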

AVAILABILITY

eTBLAST, an automated citation matching tool, and Déjà vu, the duplicate citation database, are freely available at http://invention.swmed.edu/ and http://spore.swmed.edu/dejavu
