Why and how we should join the shift from significance testing to estimation.

Affiliation

Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland.

Publication information

J Evol Biol. 2022 Jun;35(6):777-787. doi: 10.1111/jeb.14009. Epub 2022 May 18.

DOI:10.1111/jeb.14009
PMID:35582935
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9322409/
Abstract

A paradigm shift away from null hypothesis significance testing seems in progress. Based on simulations, we illustrate some of the underlying motivations. First, p-values vary strongly from study to study, hence dichotomous inference using significance thresholds is usually unjustified. Second, 'statistically significant' results have overestimated effect sizes, a bias declining with increasing statistical power. Third, 'statistically non-significant' results have underestimated effect sizes, and this bias gets stronger with higher statistical power. Fourth, the tested statistical hypotheses usually lack biological justification and are often uninformative. Despite these problems, a screen of 48 papers from the 2020 volume of the Journal of Evolutionary Biology exemplifies that significance testing is still used almost universally in evolutionary biology. All screened studies tested default null hypotheses of zero effect with the default significance threshold of p = 0.05, none presented a pre-specified alternative hypothesis, pre-study power calculation and the probability of 'false negatives' (beta error rate). The results sections of the papers presented 49 significance tests on average (median 23, range 0-390). Of 41 studies that contained verbal descriptions of a 'statistically non-significant' result, 26 (63%) falsely claimed the absence of an effect. We conclude that studies in ecology and evolutionary biology are mostly exploratory and descriptive. We should thus shift from claiming to 'test' specific hypotheses statistically to describing and discussing many hypotheses (possible true effect sizes) that are most compatible with our data, given our statistical model. We already have the means for doing so, because we routinely present compatibility ('confidence') intervals covering these hypotheses.
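The first three points of the abstract can be reproduced with a small simulation. The sketch below is not the authors' code; it is a minimal illustration under stated assumptions: two groups with a known standard deviation of 1, a hypothetical true effect of 0.5, and a two-sided z-test at the 0.05 threshold. The function name `simulate_studies` and all parameter values are chosen here for illustration only.

```python
import random
from statistics import NormalDist, fmean


def simulate_studies(true_effect=0.5, n_per_group=20, n_studies=2000,
                     alpha=0.05, seed=1):
    """Simulate many identical two-group studies and summarise the estimates.

    Assumptions (illustrative, not from the paper): normal data, known SD = 1,
    two-sided z-test on the difference in group means.
    """
    rng = random.Random(seed)
    z_dist = NormalDist()
    # Standard error of a difference in means when both groups have SD = 1.
    se = (2.0 / n_per_group) ** 0.5
    estimates, p_values = [], []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        diff = fmean(treated) - fmean(control)
        p = 2.0 * (1.0 - z_dist.cdf(abs(diff) / se))
        estimates.append(diff)
        p_values.append(p)
    # Split the estimates by the significance threshold.
    sig = [d for d, p in zip(estimates, p_values) if p < alpha]
    nonsig = [d for d, p in zip(estimates, p_values) if p >= alpha]
    return {
        "power": len(sig) / n_studies,  # share of studies with p < alpha
        "mean_all": fmean(estimates),
        "mean_significant": fmean(sig) if sig else float("nan"),
        "mean_nonsignificant": fmean(nonsig) if nonsig else float("nan"),
        "p_min": min(p_values),
        "p_max": max(p_values),
    }


if __name__ == "__main__":
    for n in (20, 100):  # lower-power versus higher-power design
        summary = simulate_studies(n_per_group=n)
        print(n, {k: round(v, 3) for k, v in summary.items()})
```

Under these assumptions, the average over all studies stays close to the true effect, while the average of only the 'significant' estimates lies above it and the average of only the 'non-significant' estimates lies below it. Comparing the two sample sizes shows the direction of each bias as power changes, and the spread between `p_min` and `p_max` shows how strongly p-values vary across replicates of the same study.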

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb28/9322409/862a8d5df0fa/JEB-35-777-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb28/9322409/ae00d86378ea/JEB-35-777-g003.jpg

Similar articles

1
Why and how we should join the shift from significance testing to estimation.
J Evol Biol. 2022 Jun;35(6):777-787. doi: 10.1111/jeb.14009. Epub 2022 May 18.
2
The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research.
PeerJ. 2017 Jul 7;5:e3544. doi: 10.7717/peerj.3544. eCollection 2017.
3
Consequences of relying on statistical significance: Some illustrations.
Eur J Clin Invest. 2018 May;48(5):e12912. doi: 10.1111/eci.12912. Epub 2018 Feb 28.
4
Trials with 'non-significant' results are not insignificant trials: a common significance threshold distorts reporting and interpretation of trial results.
Br J Anaesth. 2022 Nov;129(5):643-646. doi: 10.1016/j.bja.2022.06.023. Epub 2022 Jul 22.
5
Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise.
BMC Med Res Methodol. 2020 Sep 30;20(1):244. doi: 10.1186/s12874-020-01105-9.
6
P value and the theory of hypothesis testing: an explanation for new researchers.
Clin Orthop Relat Res. 2010 Mar;468(3):885-92. doi: 10.1007/s11999-009-1164-4.
7
P > .05: The incorrect interpretation of "not significant" results is a significant problem.
Am J Phys Anthropol. 2020 Aug;172(4):521-527. doi: 10.1002/ajpa.24092. Epub 2020 Jun 22.
8
The continuing misuse of null hypothesis significance testing in biological anthropology.
Am J Phys Anthropol. 2018 May;166(1):236-245. doi: 10.1002/ajpa.23399. Epub 2018 Jan 18.
9
Formulating appropriate statistical hypotheses for treatment comparison in clinical trial design and analysis.
Contemp Clin Trials. 2014 Nov;39(2):294-302. doi: 10.1016/j.cct.2014.09.005. Epub 2014 Oct 13.
10
Statistics in ophthalmology revisited: the (effect) size matters.
Acta Ophthalmol. 2018 Nov;96(7):e885-e888. doi: 10.1111/aos.13756. Epub 2018 Sep 5.

Cited by

1
Alternative to the statistical mass confusion of testing for "no effect".
J Cell Biol. 2025 Aug 4;224(8). doi: 10.1083/jcb.202403034. Epub 2025 Jul 23.
2
Upgrading Esterified Pyrolysis Bio-Oil through a Two-Step Process and Assessing the Performance and Emissions of Diesel-Biodiesel-Transesterified Pyrolysis Bio-Oil Blends in Diesel Engines.
ACS Omega. 2025 Apr 20;10(16):16481-16496. doi: 10.1021/acsomega.4c11064. eCollection 2025 Apr 29.
3
Alternative to the statistical mass confusion of testing for "no effect".
ArXiv. 2025 May 12:arXiv:2407.07114v3.
4
Sex differences in romantic love: an evolutionary perspective.
Biol Sex Differ. 2025 Feb 24;16(1):16. doi: 10.1186/s13293-025-00698-4.
5
Changes in heart rate variability at rest and during exercise in patients after a stroke: a feasibility study.
Biomed Eng Online. 2024 Dec 26;23(1):132. doi: 10.1186/s12938-024-01328-7.
6
Tarantula welfare may be improved with greater environmental complexity: A preliminary behavioral study with Brazilian black tarantulas (Grammastola pulchra).
PLoS One. 2024 Dec 5;19(12):e0314501. doi: 10.1371/journal.pone.0314501. eCollection 2024.
7
Seasonally variable thermal performance curves prevent adverse effects of heatwaves.
J Anim Ecol. 2025 Aug;94(8):1542-1552. doi: 10.1111/1365-2656.14221. Epub 2024 Nov 11.
8
Loss of Sunda clouded leopards and forest integrity drive potential impacts of mesopredator release on vulnerable avifauna.
Heliyon. 2024 Jun 11;10(12):e32801. doi: 10.1016/j.heliyon.2024.e32801. eCollection 2024 Jun 30.
9
Quantitative MRI at 7-Tesla reveals novel frontocortical myeloarchitecture anomalies in major depressive disorder.
Transl Psychiatry. 2024 Jun 20;14(1):262. doi: 10.1038/s41398-024-02976-y.
10
Empathy and Coping Strategies Predict Quality of Life in Japanese Healthcare Professionals.
Behav Sci (Basel). 2024 May 11;14(5):400. doi: 10.3390/bs14050400.

References

1
The evidence contained in the P-value is context dependent.
Trends Ecol Evol. 2022 Jul;37(7):569-570. doi: 10.1016/j.tree.2022.02.011. Epub 2022 Mar 21.
2
Rewriting results in the language of compatibility.
Trends Ecol Evol. 2022 Jul;37(7):567-568. doi: 10.1016/j.tree.2022.02.001. Epub 2022 Feb 25.
3
Investigating the replicability of preclinical cancer biology.
Elife. 2021 Dec 7;10:e71601. doi: 10.7554/eLife.71601.
4
Use of Confidence Intervals in Interpreting Nonstatistically Significant Results.
JAMA. 2021 Nov 23;326(20):2068-2069. doi: 10.1001/jama.2021.16172.
5
Rewriting results sections in the language of evidence.
Trends Ecol Evol. 2022 Mar;37(3):203-210. doi: 10.1016/j.tree.2021.10.009. Epub 2021 Nov 16.
6
The lesson of ivermectin: meta-analyses based on summary data alone are inherently unreliable.
Nat Med. 2021 Nov;27(11):1853-1854. doi: 10.1038/s41591-021-01535-y.
7
Reproducibility: expect less of the scientific paper.
Nature. 2021 Sep;597(7876):329-331. doi: 10.1038/d41586-021-02486-7.
8
Opinion: A better approach for dealing with reproducibility and replicability in science.
Proc Natl Acad Sci U S A. 2021 Feb 16;118(7). doi: 10.1073/pnas.2100769118.
9
Analysis goals, error-cost sensitivity, and analysis hacking: Essential considerations in hypothesis testing and multiple comparisons.
Paediatr Perinat Epidemiol. 2021 Jan;35(1):8-23. doi: 10.1111/ppe.12711. Epub 2020 Dec 2.
10
Sensory pollutants alter bird phenology and fitness across a continent.
Nature. 2020 Nov;587(7835):605-609. doi: 10.1038/s41586-020-2903-7. Epub 2020 Nov 11.