Application of Benford's law: a valuable tool for detecting scientific papers with fabricated data? : A case study using proven falsified articles against a comparison group.

Authors

Hüllemann S, Schüpfer G, Mauch J

Affiliation

Department of Anesthesiology, Luzerner Kantonsspital, 6000, Lucerne 16, Switzerland.

Publication information

Anaesthesist. 2017 Oct;66(10):795-802. doi: 10.1007/s00101-017-0333-1.

Abstract

BACKGROUND

In naturally occurring numbers, the digits 1-9 do not appear in the leading position with equal frequency, which is counterintuitive. Benford-Newcomb's law describes the expected distribution of these frequencies. It was previously shown that known fraudulent articles consistently violated this law.
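As an illustration of the expected distribution, the following minimal Python sketch computes the standard Benford probabilities for the first and second significant digits. The function names (`first_digit_prob`, `second_digit_prob`, `leading_digits`) are hypothetical and chosen for this example; the abstract does not give the authors' code, and the digit extractor assumes numbers written in plain decimal notation (no exponents).

```python
import math
from typing import Optional, Tuple


def first_digit_prob(d: int) -> float:
    """Expected frequency of d (1-9) as the first significant digit."""
    if not 1 <= d <= 9:
        raise ValueError("first digit must be in 1..9")
    return math.log10(1 + 1 / d)


def second_digit_prob(d: int) -> float:
    """Expected frequency of d (0-9) as the second significant digit.

    Sums the two-digit Benford probability over all possible first digits.
    """
    if not 0 <= d <= 9:
        raise ValueError("second digit must be in 0..9")
    return sum(math.log10(1 + 1 / (10 * d1 + d)) for d1 in range(1, 10))


def leading_digits(number_text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return the first and second significant digits of a number given as text.

    Assumption: plain decimal notation such as "0.052" or "-12.5".
    A missing digit is reported as None.
    """
    digits = "".join(ch for ch in number_text if ch.isdigit()).lstrip("0")
    first = int(digits[0]) if len(digits) > 0 else None
    second = int(digits[1]) if len(digits) > 1 else None
    return first, second


if __name__ == "__main__":
    for d in range(1, 10):
        print(f"first digit {d}: {first_digit_prob(d):.4f}")
    for d in range(0, 10):
        print(f"second digit {d}: {second_digit_prob(d):.4f}")
```

Under these formulas the digit 1 is expected in the leading position about 30% of the time, while the second-digit distribution is much flatter.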

OBJECTIVE

To compare the features of 12 known fraudulent articles from a single Japanese author to the features of 13 articles in the same research field from other Japanese authors, published during the same time period and identified with a Medline database search.

RESULTS

All 25 articles were assessed to determine whether the data violated the law. Formulas provided by the law were used to determine the expected frequencies of occurrence for the first two leading digits, which were compared with the digits in manually extracted numbers. All 12 known fraudulent papers violated the law, whereas only 6 of the 13 articles used for comparison followed it (that is, 7 violated it). Assuming that the articles in the comparison group were not falsified or fabricated, the sensitivity of assessing articles with Benford-Newcomb's law was 100% (95% confidence interval, CI: 73.54-100%), but the specificity was only 46.15% (95% CI: 19.22-74.87%) and the positive predictive value was 63.16% (95% CI: 38.36-83.71%).
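The reported point estimates follow directly from the counts in the abstract: 12 of 12 fraudulent articles flagged, and 7 of 13 comparison articles flagged. The sketch below recomputes them in Python. The abstract does not say how the confidence intervals were obtained; the exact (Clopper-Pearson) binomial interval used here is an assumption. For 12 of 12 it gives a lower bound of 0.025^(1/12), about 0.7354, which is consistent with the reported 73.54%. The function name `clopper_pearson` and the variable names are hypothetical.

```python
from math import comb


def _binom_pmf(i: int, n: int, p: float) -> float:
    """Binomial probability of exactly i successes in n trials."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


def _solve_increasing(g, iterations: int = 100) -> float:
    """Bisection on [0, 1] for the root of a function that increases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided binomial confidence interval for k successes out of n."""
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: the p at which P(X >= k) equals alpha / 2.
        lower = _solve_increasing(
            lambda p: sum(_binom_pmf(i, n, p) for i in range(k, n + 1)) - alpha / 2
        )
    if k == n:
        upper = 1.0
    else:
        # Upper bound: the p at which P(X <= k) equals alpha / 2.
        upper = _solve_increasing(
            lambda p: alpha / 2 - sum(_binom_pmf(i, n, p) for i in range(0, k + 1))
        )
    return lower, upper


# Counts taken from the abstract; "positive" means the article violated the law.
true_pos, false_neg = 12, 0   # 12 known fraudulent articles, all flagged
true_neg, false_pos = 6, 7    # 13 comparison articles, 6 followed the law

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)

if __name__ == "__main__":
    for name, k, n in [
        ("sensitivity", true_pos, true_pos + false_neg),
        ("specificity", true_neg, true_neg + false_pos),
        ("positive predictive value", true_pos, true_pos + false_pos),
    ]:
        low, high = clopper_pearson(k, n)
        print(f"{name}: {k / n:.2%} (95% CI: {low:.2%} to {high:.2%})")
```

Bisection on the binomial tail probabilities keeps the example dependency-free; a statistics library with an inverse beta distribution would give the same interval more directly.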

CONCLUSION

All 12 of the known falsified articles violated Benford-Newcomb's law, which indicates that this analysis has a high sensitivity. The low specificity of the assessment may be explained by the assumption that none of the articles identified for comparison was falsified or fabricated. Violations of Benford-Newcomb's law for the frequencies of the leading digits cannot serve as proof of falsification, but they may provide a basis for deeper discussion between editor and author about a submitted work.
