Deciding on Null Hypotheses using P-values or Bayesian alternatives: A simulation study.

Author information

UCAM Universidad Católica de Murcia.

Publication information

Psicothema. 2018 Feb;30(1):110-115. doi: 10.7334/psicothema2017.308.

DOI:10.7334/psicothema2017.308

PMID:29363479

Abstract

BACKGROUND

Despite criticism, the p-value remains one of the key elements of statistical hypothesis testing. Bayesian statistics and Bayes factors have been proposed as alternatives to improve scientific decision making when testing a hypothesis. This study compares the performance of two Bayes factor estimates (the BIC-based Bayes factor and the Vovk-Sellke p-value calibration) with that of the p-value when the null hypothesis holds.
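For orientation, the two Bayesian quantities named above are commonly defined as follows. The abstract does not state the formulas the authors used, so this is an assumption based on the usual textbook forms: $\mathrm{BIC}_0$ and $\mathrm{BIC}_1$ denote the BIC of the null and alternative models, and $p$ is the p-value of the test.

```latex
\[
\mathrm{BF}_{01}^{\mathrm{BIC}} \approx \exp\!\left(\frac{\mathrm{BIC}_1 - \mathrm{BIC}_0}{2}\right),
\qquad
\mathrm{BF}_{10}^{\mathrm{VS}} \le \frac{1}{-e\,p\,\ln p} \quad \text{for } p < 1/e .
\]
```

The first expression approximates the evidence for the null over the alternative; the second is an upper bound on the evidence against the null that a given p-value can represent. For example, $p = 0.05$ gives a bound of about 2.46.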

METHOD

One million pairs of independent data sets were simulated. All data were drawn from a normal population, and several sample sizes were considered. For each pair, the exact p-value for the comparison of sample means was recorded, along with the Bayesian alternatives.
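A minimal sketch of this kind of simulation is given below. It is not the authors' code: the sample sizes, the number of pairs (reduced from one million to keep it quick), the use of a two-sample Student t-test, the closed-form BIC approximation, and the 1/3 evidence threshold are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def simulate_null(n_per_group, n_pairs=10_000, seed=0):
    """Draw pairs of samples from the same N(0, 1) population (H0 true)
    and return the two-sample t statistic and p-value for each pair."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_pairs, n_per_group))
    y = rng.standard_normal((n_pairs, n_per_group))
    t, p = stats.ttest_ind(x, y, axis=1)  # equal-variance Student t-test
    return t, p


def bic_bf01(t, n_total):
    """BIC approximation to the Bayes factor for H0 over H1.

    For a two-group comparison, BIC1 - BIC0 = n*ln(SSE1/SSE0) + ln(n),
    and SSE0/SSE1 = 1 + t^2/df with df = n - 2, so
    BF01 = exp((BIC1 - BIC0)/2) = sqrt(n) * (1 + t^2/df)^(-n/2).
    """
    df = n_total - 2
    return np.sqrt(n_total) * (1.0 + np.square(t) / df) ** (-n_total / 2.0)


def vovk_sellke_bf01(p):
    """Vovk-Sellke calibration expressed as a lower bound on BF01:
    -e * p * ln(p) for p < 1/e, and 1 otherwise."""
    p = np.asarray(p, dtype=float)
    safe = np.clip(p, 1e-300, 1.0)  # avoid log(0)
    return np.where(p < 1.0 / np.e, -np.e * safe * np.log(safe), 1.0)


if __name__ == "__main__":
    # Hypothetical per-group sample sizes; the abstract does not list them.
    for n in (10, 50, 200):
        t, p = simulate_null(n)
        bf_bic = bic_bf01(t, 2 * n)
        bf_vs = vovk_sellke_bf01(p)
        print(
            f"n={n:4d}  "
            f"P(p < .05)={np.mean(p < 0.05):.3f}  "
            f"P(BIC BF01 < 1/3)={np.mean(bf_bic < 1 / 3):.3f}  "
            f"P(VS BF01 < 1/3)={np.mean(bf_vs < 1 / 3):.3f}"
        )
```

Each printed row reports how often a true null would be called into question under each criterion, which is the kind of false-discovery comparison the abstract describes.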

RESULTS

Bayes factors performed better than the p-value, favouring the null hypothesis over the alternative. Under the simulation conditions, the BIC-based Bayes factor was more accurate than the p-value calibration, and its accuracy improved as the sample size grew.

CONCLUSIONS

Our results show that Bayes factors are good complements for hypothesis testing. Using the Bayesian alternatives we tested could help researchers avoid claiming false statistical discoveries. We suggest using classical and Bayesian statistics together rather than rejecting either of them.

Similar articles

Bayesian evaluation of informative hypotheses in cluster-randomized trials.

Behav Res Methods. 2019 Feb;51(1):126-137. doi: 10.3758/s13428-018-1149-x.

Significance test for linear regression: how to test without P-values?

J Appl Stat. 2020 Mar 31;48(5):827-845. doi: 10.1080/02664763.2020.1748180. eCollection 2021.

Bayes factor approaches for testing interval null hypotheses.

Psychol Methods. 2011 Dec;16(4):406-19. doi: 10.1037/a0024377. Epub 2011 Jul 25.

Worked-out examples of the adequacy of Bayesian optional stopping.

Psychon Bull Rev. 2022 Feb;29(1):70-87. doi: 10.3758/s13423-021-01962-5. Epub 2021 Jul 12.

Prior sensitivity of null hypothesis Bayesian testing.

Psychol Methods. 2022 Oct;27(5):804-821. doi: 10.1037/met0000292. Epub 2021 Sep 27.

Bayesian Hodges-Lehmann tests for statistical equivalence in the two-sample setting: Power analysis, type I error rates and equivalence boundary selection in biomedical research.

BMC Med Res Methodol. 2021 Aug 17;21(1):171. doi: 10.1186/s12874-021-01341-7.

Unscaled Bayes factors for multiple hypothesis testing in microarray experiments.

Stat Methods Med Res. 2015 Dec;24(6):1030-43. doi: 10.1177/0962280212437827. Epub 2012 Feb 15.

To P or Not to P: Backing Bayesian Statistics.

Otolaryngol Head Neck Surg. 2017 Dec;157(6):915-918. doi: 10.1177/0194599817739260.

Bayesian alternatives for common null-hypothesis significance tests in psychiatry: a non-technical guide using JASP.

BMC Psychiatry. 2018 Jun 7;18(1):178. doi: 10.1186/s12888-018-1761-4.

Cited by

Influence of the statistical significance of results and spin on readers' interpretation of the results in an abstract for a hypothetical clinical trial: a randomised trial.

BMJ Open. 2022 Apr 8;12(4):e056503. doi: 10.1136/bmjopen-2021-056503.

Validation of the Orgasm Rating Scale in Context of Sexual Relationships of Gay and Lesbian Adults.

Int J Environ Res Public Health. 2022 Jan 13;19(2):887. doi: 10.3390/ijerph19020887.

Effect of conventional transcranial direct current stimulation devices and electrode sizes on motor cortical excitability of the quadriceps muscle.

Restor Neurol Neurosci. 2021;39(5):379-391. doi: 10.3233/RNN-211210.

How Does Ankle Mechanical Stiffness Change as a Function of Muscle Activation in Standing and During the Late Stance of Walking?

IEEE Trans Biomed Eng. 2022 Mar;69(3):1186-1193. doi: 10.1109/TBME.2021.3117516. Epub 2022 Feb 18.
