Statistical hypothesis testing--how exact are exact p-values?

Author information

Gasko R

Affiliation

Vzajomna zdravotna poistovna Dovera, Kosice, Slovakia.

Publication information

Bratisl Lek Listy. 2003;104(1):36-9.

PMID: 12830995

Abstract

OBJECTIVES AND BACKGROUND

When a hypothesis is tested statistically, it is generally accepted that exact p values should be reported in the paper. Researchers can choose among many statistical computer programmes that implement hypothesis tests. Do different statistical programmes calculate identical exact p values for the same statistical test?

METHODS

The respective null hypotheses were tested in 5 artificially created data sets using the parametric unpaired t-test, the non-parametric Mann-Whitney test, and the two-tailed F-test. The calculations were carried out with the following programmes: Statistix, version 7.1 (source www.statistix.com); Analyse-it, version 1.62 (source www.analyse-it.com); and MedCalc, version 6.14 (source www.medcalc.be). The p values obtained for the same tests were compared with one another.
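The abstract does not say which algorithm each programme used, so the following is only a sketch of one plausible source of disagreement: for the Mann-Whitney test, a two-sided p value can come from the exact permutation distribution of U or from a normal approximation, with or without a continuity correction. The data and the function names (`u_statistic`, `p_exact`, `p_normal`) are hypothetical and are not taken from the paper; the exact routine assumes there are no ties.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical two-sample data without ties (NOT the paper's data sets).
group_a = [1.1, 2.3, 2.9, 3.4, 4.8]
group_b = [3.9, 5.2, 6.1, 6.7, 7.5]


def u_statistic(x, y):
    """Count pairs (a, b) with a > b; a tie would count as one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def p_exact(x, y):
    """Two-sided exact p value by enumerating all rank assignments (no ties)."""
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    observed = abs(u_statistic(x, y) - mean_u)
    extreme = 0
    # Under the null hypothesis every set of n1 ranks is equally likely.
    for ranks in combinations(range(1, n1 + n2 + 1), n1):
        u = sum(ranks) - n1 * (n1 + 1) / 2
        if abs(u - mean_u) >= observed:
            extreme += 1
    return extreme / comb(n1 + n2, n1)


def p_normal(x, y, continuity):
    """Two-sided p value from the normal approximation to U (no tie correction)."""
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    distance = abs(u_statistic(x, y) - mean_u)
    if continuity:
        distance = max(distance - 0.5, 0.0)
    return 2 * (1 - NormalDist().cdf(distance / sd_u))


p_values = {
    "exact": p_exact(group_a, group_b),
    "normal": p_normal(group_a, group_b, continuity=False),
    "normal_cc": p_normal(group_a, group_b, continuity=True),
}
for name, p in p_values.items():
    print(f"{name:>10}: p = {p:.4f}")
```

All three calls test the same null hypothesis on the same data, yet each returns a different p value, which is the kind of discrepancy the study set out to measure.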

RESULTS

All three programmes calculated identical exact p values for the t-test. For the remaining two tests, different p values were calculated in 26 out of 44 calculations (59.1 per cent; 95 per cent confidence interval 43-73 per cent). The greatest difference was 18.35 per cent. In two cases the values fell on either side of 0.05, which led to substantially different interpretations of the results.
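The reported proportion of 26 out of 44 can be recomputed directly. The abstract does not state which interval method was used, so the sketch below simply compares three common choices for a binomial proportion: the Wald interval, the Wilson score interval, and the Wilson interval with continuity correction. The function names are hypothetical, and no claim is made about which method the author applied.

```python
from math import sqrt
from statistics import NormalDist

k, n = 26, 44                      # differing calculations / all calculations
p_hat = k / n
z = NormalDist().inv_cdf(0.975)    # critical value for a two-sided 95% interval


def wald(p, n, z):
    """Simple normal-approximation interval."""
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson(p, n, z):
    """Wilson score interval."""
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def wilson_cc(p, n, z):
    """Wilson score interval with continuity correction."""
    q = 1 - p
    denom = 2 * (n + z * z)
    lower = (2 * n * p + z * z - 1
             - z * sqrt(z * z - 2 - 1 / n + 4 * p * (n * q + 1))) / denom
    upper = (2 * n * p + z * z + 1
             + z * sqrt(z * z + 2 - 1 / n + 4 * p * (n * q - 1))) / denom
    return max(0.0, lower), min(1.0, upper)


intervals = {
    "Wald": wald(p_hat, n, z),
    "Wilson": wilson(p_hat, n, z),
    "Wilson (cc)": wilson_cc(p_hat, n, z),
}
print(f"proportion = {100 * p_hat:.1f} per cent")
for name, (lo, hi) in intervals.items():
    print(f"{name:>12}: {100 * lo:.1f} to {100 * hi:.1f} per cent")
```

For these counts the bounds differ by about one percentage point depending on the method, so an interval is only reproducible when the method behind it is named.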

CONCLUSIONS

The use of significance tests in biomedical research has long been subject to criticism. Testing the null hypothesis at the arbitrary significance level of 0.05 should be replaced by other methods. Our findings should undermine the ungrounded belief of users of statistical tests, namely physicians, in the unassailable accuracy of mathematical procedures. The use of confidence intervals seems much more suitable, although there are objections to them as well. (Tab. 4, Fig. 1, Ref. 19.)

Similar articles

1
Statistical hypothesis testing--how exact are exact p-values?
Bratisl Lek Listy. 2003;104(1):36-9.
2
[Comparison of two or more samples of quantitative data].
Acta Med Croatica. 2006;60 Suppl 1:37-46.
3
Statistics in ophthalmology revisited: the (effect) size matters.
Acta Ophthalmol. 2018 Nov;96(7):e885-e888. doi: 10.1111/aos.13756. Epub 2018 Sep 5.
4
[The uncertainties of statistical "significance"].
Rev Med Chil. 2018 Dec;146(10):1184-1189. doi: 10.4067/S0034-98872018001001184.
5
Statistical inference by confidence intervals: issues of interpretation and utilization.
Phys Ther. 1999 Feb;79(2):186-95.
6
Effect size, confidence interval and statistical significance: a practical guide for biologists.
Biol Rev Camb Philos Soc. 2007 Nov;82(4):591-605. doi: 10.1111/j.1469-185X.2007.00027.x.
7
Statistics review 3: hypothesis testing and P values.
Crit Care. 2002 Jun;6(3):222-5. doi: 10.1186/cc1493. Epub 2002 Mar 18.
8
The use and abuse of hypothesis tests: how to present P values.
Phlebology. 2010 Jun;25(3):107-12. doi: 10.1258/phleb.2010.009094.
9
Trimmed weighted Simes' test for two one-sided hypotheses with arbitrarily correlated test statistics.
Biom J. 2009 Dec;51(6):885-98. doi: 10.1002/bimj.200900132.
10
Biostatistics Series Module 2: Overview of Hypothesis Testing.
Indian J Dermatol. 2016 Mar-Apr;61(2):137-45. doi: 10.4103/0019-5154.177775.

Cited by

1
An assessment of recently published gene expression data analyses: reporting experimental design and statistical factors.
BMC Med Inform Decis Mak. 2006 Jun 21;6:27. doi: 10.1186/1472-6947-6-27.