Unveiling scientific articles from paper mills with provenance analysis.

Affiliations

Artificial Intelligence Lab. Recod.ai, Institute of Computing, Universidade Estadual de Campinas, Campinas, São Paulo, Brazil.

Department of Computer Science, Loyola University Chicago, Chicago, Illinois, United States of America.

Publication information

PLoS One. 2024 Oct 30;19(10):e0312666. doi: 10.1371/journal.pone.0312666. eCollection 2024.

DOI: 10.1371/journal.pone.0312666
PMID: 39476003
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11524478/

Abstract

The increasing prevalence of fake publications created by paper mills poses a significant challenge to maintaining scientific integrity. While integrity analysts typically rely on textual and visual clues to identify fake articles, determining which papers merit further investigation can be akin to searching for a needle in a haystack, because these fake publications have unrelated authors and appear in unrelated venues. To address this challenge, we developed a new methodology for provenance analysis, which automatically tracks and groups suspicious figures and documents. Our approach groups manuscripts from the same paper mill by analyzing their figures and identifying duplicated and manipulated regions. These regions are linked and organized in a provenance graph, providing evidence of systematic production. We tested our solution on a paper mill dataset of hundreds of documents and on a larger version of the dataset that included thousands of additional documents deliberately selected to distract our method. Our approach successfully identified and linked systematically produced articles in both datasets by pinpointing the figures they reused and manipulated from one another. The proposed technique offers a promising way to identify fraudulent manuscripts, and it could be a valuable tool for supporting scientific integrity.
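
The grouping step described in the abstract (linking figures that share duplicated or manipulated regions, then clustering the documents they belong to) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how regions are matched or how the provenance graph is built. The sketch assumes a hypothetical upstream matcher that has already produced `(doc_a, fig_a, doc_b, fig_b, score)` tuples, and it uses a plain union-find over documents as a stand-in for the graph analysis. The name `group_documents` and the `min_score` threshold are illustrative assumptions.

```python
from collections import defaultdict


def group_documents(matches, min_score=0.5):
    """Group documents that are linked by matched figure regions.

    matches: iterable of (doc_a, fig_a, doc_b, fig_b, score) tuples.
    The score is a hypothetical similarity in [0, 1] produced by some
    upstream figure matcher (an assumption, not part of the source).
    Returns (clusters, edges): clusters is a sorted list of sorted
    document-ID lists; edges are the figure-level links that were kept.
    """
    parent = {}

    def find(x):
        # Find the cluster representative, compressing the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    edges = []
    for doc_a, fig_a, doc_b, fig_b, score in matches:
        # Skip weak matches and reuse inside a single document:
        # neither one links two different manuscripts.
        if score < min_score or doc_a == doc_b:
            continue
        edges.append(((doc_a, fig_a), (doc_b, fig_b), score))
        union(doc_a, doc_b)

    groups = defaultdict(set)
    for doc in parent:
        groups[find(doc)].add(doc)
    clusters = sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])
    return clusters, edges


# Toy input: A-B and B-C share figure content, D-E is a weak match.
example = [
    ("paper_A", "fig1", "paper_B", "fig3", 0.92),
    ("paper_B", "fig2", "paper_C", "fig1", 0.81),
    ("paper_D", "fig1", "paper_E", "fig2", 0.20),
    ("paper_F", "fig4", "paper_G", "fig1", 0.77),
]
clusters, edges = group_documents(example)
print(clusters)
```

In this toy input, `paper_A` and `paper_C` never share a figure directly, yet they end up in the same cluster through `paper_B`. That transitive linking is what lets figure reuse surface a group of manuscripts whose authors and venues look unrelated. The retained `edges` list keeps the figure-level evidence behind each cluster.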

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/079b86c148d6/pone.0312666.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/58b10a99c8e4/pone.0312666.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/d23b373de61e/pone.0312666.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/defb8b48cc91/pone.0312666.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/b8c0603ab285/pone.0312666.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/143cb684f001/pone.0312666.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/99ca5de42abf/pone.0312666.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/089cd6c44897/pone.0312666.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/651de2b95f00/pone.0312666.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/fbaaa8e34b23/pone.0312666.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/b42f8a670aa8/pone.0312666.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/ab50a5e7f0fb/pone.0312666.g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/c43768b79369/pone.0312666.g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ced/11524478/83dfb1538e5b/pone.0312666.g014.jpg

Similar articles

1. Unveiling scientific articles from paper mills with provenance analysis. PLoS One. 2024 Oct 30;19(10):e0312666. doi: 10.1371/journal.pone.0312666. eCollection 2024.
2. Threats to scholarly research integrity arising from paper mills: a rapid scoping review. Clin Rheumatol. 2022 Jul;41(7):2241-2248. doi: 10.1007/s10067-022-06198-9. Epub 2022 May 6.
3. Fake paper identification in the pool of withdrawn and rejected manuscripts submitted to Naunyn-Schmiedeberg's Archives of Pharmacology. Naunyn Schmiedebergs Arch Pharmacol. 2024 Apr;397(4):2171-2181. doi: 10.1007/s00210-023-02741-w. Epub 2023 Oct 5.
4. Paper mill challenges: past, present, and future. J Clin Epidemiol. 2024 Dec;176:111549. doi: 10.1016/j.jclinepi.2024.111549. Epub 2024 Oct 9.
5. How Naunyn-Schmiedeberg's Archives of Pharmacology deals with fraudulent papers from paper mills. Naunyn Schmiedebergs Arch Pharmacol. 2021 Mar;394(3):431-436. doi: 10.1007/s00210-021-02056-8.
6. How to fight fake papers: a review on important information sources and steps towards solution of the problem. Naunyn Schmiedebergs Arch Pharmacol. 2024 Dec;397(12):9281-9294. doi: 10.1007/s00210-024-03272-8. Epub 2024 Jul 6.
7. Metadata analysis of retracted fake papers in Naunyn-Schmiedeberg's Archives of Pharmacology. Naunyn Schmiedebergs Arch Pharmacol. 2024 Jun;397(6):3995-4011. doi: 10.1007/s00210-023-02850-6. Epub 2023 Nov 23.
8. "Research paper mills": A factory outlet for dubious research. Indian J Med Ethics. 2024 Jul-Sep;IX(3):222-227. doi: 10.20529/IJME.2024.025.
9. Digital magic, or the dark arts of the 21st century-how can journals and peer reviewers detect manuscripts and publications from paper mills? FEBS Lett. 2020 Feb;594(4):583-589. doi: 10.1002/1873-3468.13747. Epub 2020 Feb 17.
10. Detection of fake papers in the era of artificial intelligence. Diagnosis (Berl). 2023 Aug 17;10(4):390-397. doi: 10.1515/dx-2023-0090. eCollection 2023 Nov 1.

Cited by

1. [Scientific fraud and dubious publication practices]. Med Klin Intensivmed Notfmed. 2025 Aug 7. doi: 10.1007/s00063-025-01307-3.
2. Widespread misidentification of scanning electron microscope instruments in the peer-reviewed materials science and engineering literature. PLoS One. 2025 Jul 17;20(7):e0326754. doi: 10.1371/journal.pone.0326754. eCollection 2025.
