Algorithmic identification of discrepancies between published ratios and their reported confidence intervals and P-values.

Affiliations

Arthritis and Clinical Immunology Research Program, Division of Genomics and Data Sciences, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma 73104-5005.

Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center.

Publication information

Bioinformatics. 2018 May 15;34(10):1758-1766. doi: 10.1093/bioinformatics/btx811.

DOI: 10.1093/bioinformatics/btx811
PMID: 29309530
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5946902/

Abstract

MOTIVATION

Studies, mostly from the operations/management literature, have shown that the rate of human error increases with task complexity. What is not known is how many errors make it into the published literature, given that they must slip by peer-review. By identifying paired, dependent values within text for reported calculations of varying complexity, we can identify discrepancies, quantify error rates and identify mitigating factors.

RESULTS

We extracted statistical ratios from MEDLINE abstracts (hazard ratio, odds ratio, relative risk), their 95% CIs, and their P-values. We re-calculated the ratios and P-values using the reported CIs. For comparison, we also extracted percent-ratio pairs, one of the simplest calculation tasks. Over 486 000 published values were found and analyzed for discrepancies, allowing for rounding and significant figures. Per reported item, discrepancies were less frequent in percent-ratio calculations (2.7%) than in ratio-CI and P-value calculations (5.6-7.5%), and smaller discrepancies were more frequent than large ones. Systematic discrepancies (multiple incorrect calculations of the same type) were higher for more complex tasks (14.3%) than simple ones (6.7%). Discrepancy rates decreased with increasing journal impact factor (JIF) and increasing number of authors, but with diminishing returns and JIF accounting for most of the effect. Approximately 87% of the 81 937 extracted P-values were ≤ 0.05.
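
The recalculation described above can be illustrated with a minimal sketch. The abstract does not give the formulas used, so this assumes the standard back-calculation: the 95% CI is taken to be symmetric on the log scale and the test statistic approximately normal. The function names and the rounding tolerance are hypothetical and are not the authors' implementation.

```python
import math

# Two-sided 95% quantile of the standard normal distribution.
Z_95 = 1.959964


def ratio_from_ci(lower, upper):
    """Point estimate implied by a CI that is symmetric on the log scale.

    Under that assumption the ratio is the geometric mean of the bounds.
    """
    return math.sqrt(lower * upper)


def p_from_ratio_ci(ratio, lower, upper):
    """Two-sided P-value implied by a ratio and its 95% CI.

    Assumes a normal approximation on the log scale:
    SE = (ln(upper) - ln(lower)) / (2 * 1.96), z = ln(ratio) / SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(ratio) / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


def percent_matches(numerator, denominator, reported_percent, decimals=0):
    """True if a reported percentage agrees with numerator/denominator.

    The tolerance is half a unit in the last reported decimal place,
    a hypothetical stand-in for the rounding allowance in the abstract.
    """
    exact = 100.0 * numerator / denominator
    tolerance = 0.5 * 10 ** (-decimals)
    return abs(exact - reported_percent) <= tolerance + 1e-9
```

For example, a reported 95% CI of 1.0 to 4.0 implies a ratio of 2.0, and because the lower bound sits exactly at 1 the implied P-value is about 0.05. A reported ratio or P-value far from these recomputed values would be flagged as a discrepancy.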

CONCLUSION

Using a simple, yet accurate, approach to identifying paired values within text, we offer the first quantitative evaluation of published error frequencies within these types of calculations.

CONTACT

jonathan-wren@omrf.org or jdwren@gmail.com.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
