Significance, Errors, Power, and Sample Size: The Blocking and Tackling of Statistics.

Author information

From the Departments of Quantitative Health Sciences and Outcomes Research, Cleveland Clinic, Cleveland, Ohio.

Department of Surgery and Perioperative Care, Dell Medical School at the University of Texas at Austin, Austin, Texas.

Publication information

Anesth Analg. 2018 Feb;126(2):691-698. doi: 10.1213/ANE.0000000000002741.

DOI:10.1213/ANE.0000000000002741

PMID:29346210

Abstract

Inferential statistics relies heavily on the central limit theorem and the related law of large numbers. According to the central limit theorem, regardless of the distribution of the source population, a sample estimate of that population will have a normal distribution, but only if the sample is large enough. The related law of large numbers holds that the central limit theorem is valid as random samples become large enough, usually defined as an n ≥ 30. In research-related hypothesis testing, the term "statistically significant" is used to describe when an observed difference or association has met a certain threshold. This significance threshold or cut-point is denoted as alpha (α) and is typically set at .05. When the observed P value is less than α, one rejects the null hypothesis (Ho) and accepts the alternative. Clinical significance is even more important than statistical significance, so treatment effect estimates and confidence intervals should be regularly reported. A type I error occurs when the Ho of no difference or no association is rejected, when in fact the Ho is true. A type II error occurs when the Ho is not rejected, when in fact there is a true population effect. Power is the probability of detecting a true difference, effect, or association if it truly exists. Sample size justification and power analysis are key elements of a study design. Ethical concerns arise when studies are poorly planned or underpowered. When calculating sample size for comparing groups, 4 quantities are needed: α, type II error, the difference or effect of interest, and the estimated variability of the outcome variable. Sample size increases for increasing variability and power, and for decreasing α and decreasing difference to detect. Sample size for a given relative reduction in proportions depends heavily on the proportion in the control group itself, and increases as the proportion decreases. 
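The four inputs named above (α, type II error, the difference of interest, and outcome variability) can be made concrete with a minimal sketch. It assumes the usual normal-approximation formulas for a two-sided test with equal group sizes; the function names and the example numbers are illustrative and are not taken from the article.

```python
import math
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution (Python 3.8+ standard library)


def n_per_group_means(alpha, power, delta, sigma):
    """Approximate per-group n to detect a mean difference `delta`.

    Assumes a two-sided test, equal group sizes, a common standard
    deviation `sigma`, and the normal approximation (hypothetical helper).
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    z_beta = _Z.inv_cdf(power)           # power = 1 - type II error
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


def n_per_group_proportions(alpha, power, p_control, relative_reduction):
    """Approximate per-group n to detect a relative reduction in a proportion.

    Uses the unpooled-variance normal approximation (hypothetical helper).
    """
    p_treat = p_control * (1 - relative_reduction)
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    z_beta = _Z.inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return math.ceil(
        (z_alpha + z_beta) ** 2 * variance_sum / (p_control - p_treat) ** 2
    )


if __name__ == "__main__":
    # alpha = .05, 80% power, difference of 5 units, SD of 10 units
    print(n_per_group_means(0.05, 0.80, delta=5, sigma=10))  # → 63

    # Same 25% relative reduction, different control-group proportions
    for p in (0.40, 0.20, 0.10):
        print(p, n_per_group_proportions(0.05, 0.80, p, 0.25))
```

Under these assumptions, the loop shows the required sample size growing as the control-group proportion shrinks, even though the relative reduction is held at 25%. Exact or simulation-based methods would be preferable for small samples.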
Sample size for single-group studies estimating an unknown parameter is based on the desired precision of the estimate. Interim analyses assessing for efficacy and/or futility are great tools to save time and money, as well as allow science to progress faster, but are only 1 component considered when a decision to stop or continue a trial is made.
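For the single-group case, a precision-based calculation can be sketched as follows. This is a minimal illustration, assuming a normal-approximation (Wald-type) confidence interval and a target half-width chosen by the investigator; the function names and example values are hypothetical and do not come from the article.

```python
import math
from statistics import NormalDist


def n_for_proportion(confidence, expected_p, half_width):
    """Approximate n so a proportion's confidence interval has the given half-width.

    Assumes a normal-approximation interval; `expected_p` is a planning guess
    (0.5 is the most conservative choice).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * expected_p * (1 - expected_p) / half_width ** 2)


def n_for_mean(confidence, sigma, half_width):
    """Approximate n so a mean's confidence interval has the given half-width.

    Assumes the standard deviation `sigma` is known from prior data.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / half_width) ** 2)


if __name__ == "__main__":
    # 95% CI, proportion near 0.5, estimated to within +/- 5 percentage points
    print(n_for_proportion(0.95, 0.5, 0.05))  # → 385
    # 95% CI, SD of 10 units, mean estimated to within +/- 2 units
    print(n_for_mean(0.95, sigma=10, half_width=2))  # → 97
```

Halving the half-width roughly quadruples the required sample size in both helpers, because the half-width enters the formula squared.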

Similar articles

Futility interim monitoring with control of type I and II error probabilities using the interim Z-value or confidence limit.

Clin Trials. 2009 Dec;6(6):565-73. doi: 10.1177/1740774509350327. Epub 2009 Nov 23.

Statistical Power in Plant Pathology Research.

Phytopathology. 2018 Jan;108(1):15-22. doi: 10.1094/PHYTO-03-17-0098-LE. Epub 2017 Oct 30.

The reassessment of trial perspectives from interim data--a critical view.

Stat Med. 2006 Jan 15;25(1):23-36. doi: 10.1002/sim.2180.

Inappropriate use of statistical power.

Bone Marrow Transplant. 2023 May;58(5):474-477. doi: 10.1038/s41409-023-01935-3. Epub 2023 Mar 3.

[Principles of tests of hypotheses in statistics: alpha, beta and P].

Ann Fr Anesth Reanim. 1998;17(9):1168-80. doi: 10.1016/s0750-7658(00)80015-5.

Operating characteristics of sample size re-estimation with futility stopping based on conditional power.

Stat Med. 2006 Oct 15;25(19):3348-65. doi: 10.1002/sim.2455.

On sample size estimation and re-estimation adjusting for variability in confirmatory trials.

J Biopharm Stat. 2016;26(1):44-54. doi: 10.1080/10543406.2015.1092031.

Concepts in sample size determination.

Indian J Dent Res. 2012 Sep-Oct;23(5):660-4. doi: 10.4103/0970-9290.107385.

Statistical Significance

Cited by

Establishing next-generation reference intervals for pro-gastrin-releasing peptide using a dynamic modeling approach.

J Transl Med. 2025 Sep 2;23(1):983. doi: 10.1186/s12967-025-07014-z.

Implementation of Behavior Change Theories and Techniques for Physical Activity Just-in-Time Adaptive Interventions: A Scoping Review.

Int J Environ Res Public Health. 2025 Jul 17;22(7):1133. doi: 10.3390/ijerph22071133.

Placental and cerebral circulation in fetuses of mothers with polycystic ovary syndrome and the effect of Metformin exposure.

BMC Pregnancy Childbirth. 2025 Jul 10;25(1):749. doi: 10.1186/s12884-025-07866-9.

Using Speech Features and Machine Learning Models to Predict Emotional and Behavioral Problems in Chinese Adolescents.

Depress Anxiety. 2025 Jun 16;2025:5734107. doi: 10.1155/da/5734107. eCollection 2025.

The effect of acute hot water immersion on cutaneous peripheral microvascular responses in males of White-European, Black-African and South-Asian descent.

Temperature (Austin). 2025 Jan 24;12(2):149-165. doi: 10.1080/23328940.2025.2453959. eCollection 2025.

Assessment of sub-maximal aerobic capacity in North African patients with chronic hepatitis B: a pilot case-control study.

F1000Res. 2025 Apr 8;14:98. doi: 10.12688/f1000research.160390.1. eCollection 2025.

Assessment and Intervention for Diabetes Distress in Primary Care Using Clinical and Technological Interventions: Protocol for a Single-Arm Pilot Trial.

JMIR Res Protoc. 2025 Mar 31;14:e62916. doi: 10.2196/62916.

Comments on "Modulation of NRF2 and CYP24A1 Pathways by Hookah Smoke: Implications for Male Reproductive Health".

Am J Mens Health. 2025 Mar-Apr;19(2):15579883251324038. doi: 10.1177/15579883251324038. Epub 2025 Mar 18.

Mapping Organism-wide Single Cell mRNA Expression Linked to Extracellular Vesicle Biogenesis, Secretion, and Cargo.

Function (Oxf). 2025 Mar 24;6(2). doi: 10.1093/function/zqaf005.

Effects of Hook Maneuver on Oxygen Saturation Recovery After -40 m Apnea Dive-A Randomized Crossover Trial.

Sports (Basel). 2025 Jan 15;13(1):24. doi: 10.3390/sports13010024.
