Higgins Janet R, Lin Feng-Chang, Evans James P
1 American College of Medical Genetics and Genomics, Bethesda, MD, USA.
2 Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
Res Integr Peer Rev. 2016 Oct 10;1:13. doi: 10.1186/s41073-016-0021-8. eCollection 2016.
Plagiarism is common and threatens the integrity of the scientific literature. However, detecting it is time-consuming and difficult, which poses a challenge to the editors and publishers entrusted with ensuring the integrity of the published literature.
This study documented the extent of plagiarism in manuscripts submitted to a major specialty medical journal. We manually curated submitted manuscripts and deemed an article to contain plagiarism if one sentence had 80 % of its words copied from another published paper. Commercial plagiarism detection software was also used, and its use was optimized.
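The 80 % sentence criterion can be illustrated with a short sketch. The study applied the criterion through manual curation, so the code below is not the authors' procedure: it assumes a crude bag-of-words overlap, and the names `copied_fraction` and `flags_plagiarism` are hypothetical.

```python
import re


def words(text):
    """Lowercase word tokens; a simplifying assumption about what counts as a word."""
    return re.findall(r"[a-z0-9']+", text.lower())


def copied_fraction(sentence, source_text):
    """Fraction of the sentence's words that also occur anywhere in source_text."""
    tokens = words(sentence)
    if not tokens:
        return 0.0
    source_vocab = set(words(source_text))
    return sum(token in source_vocab for token in tokens) / len(tokens)


def flags_plagiarism(manuscript_sentences, source_text, threshold=0.80):
    """True if any single sentence reaches the threshold (80 % in the abstract)."""
    return any(
        copied_fraction(sentence, source_text) >= threshold
        for sentence in manuscript_sentences
    )


# Hypothetical example texts, not data from the study.
source = "Plagiarism is common and threatens the integrity of the scientific literature."
manuscript = [
    "We measured blood pressure in mice.",
    "Plagiarism is common and threatens the scientific literature.",
]
print(flags_plagiarism(manuscript, source))
```

Word overlap that ignores word order will over-flag sentences built from common vocabulary, which is one reason a human reviewer or a phrase-level comparison would still be needed in practice.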
In 400 consecutively submitted manuscripts, 17 % of submissions contained unacceptable levels of plagiarized material, and 82 % of the plagiarized manuscripts were submitted from countries where English was not an official language. Using the most commonly employed commercial plagiarism detection software, we studied sensitivity and specificity as a function of the generated plagiarism score. The cutoff score that maximized both sensitivity and specificity was 15 % (sensitivity 84.8 %, specificity 80.5 %).
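As a rough illustration of how such a cutoff might be chosen, the sketch below scans candidate cutoffs and keeps the one with the largest sum of sensitivity and specificity (equivalent to maximizing Youden's J). The abstract does not state which optimization rule was used, so this rule is an assumption, as is treating a score at or above the cutoff as "flagged". The scores and labels are invented for the example and are not the study's data.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is treated as flagged."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, plagiarized in pairs if s >= cutoff and plagiarized)
    fn = sum(1 for s, plagiarized in pairs if s < cutoff and plagiarized)
    tn = sum(1 for s, plagiarized in pairs if s < cutoff and not plagiarized)
    fp = sum(1 for s, plagiarized in pairs if s >= cutoff and not plagiarized)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, labels):
    """Cutoff maximizing sensitivity + specificity (assumed rule: Youden's J)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sensitivity_specificity(scores, labels, c)))


# Hypothetical similarity scores (%) with manual-curation labels (True = plagiarized).
scores = [5, 8, 12, 14, 16, 20, 30, 45]
labels = [False, False, True, False, True, True, True, True]

cutoff = best_cutoff(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, cutoff)
print(cutoff, sens, spec)
```

A journal that cares more about missing plagiarized manuscripts than about extra manual checks could instead pick a lower cutoff, trading specificity for sensitivity.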
Plagiarism was a common occurrence among manuscripts submitted for publication to a major American specialty medical journal and most manuscripts with plagiarized material were submitted from countries in which English was not an official language. The use of commercial plagiarism detection software can be optimized by selecting a cutoff score that reflects desired sensitivity and specificity.