Small is beautiful: In defense of the small-N design.

Affiliation

The University of Melbourne, Melbourne, Australia.

Publication information

Psychon Bull Rev. 2018 Dec;25(6):2083-2101. doi: 10.3758/s13423-018-1451-8.

DOI:10.3758/s13423-018-1451-8

PMID:29557067

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6267527/

Abstract

The dominant paradigm for inference in psychology is a null-hypothesis significance testing one. Recently, the foundations of this paradigm have been shaken by several notable replication failures. One recommendation to remedy the replication crisis is to collect larger samples of participants. We argue that this recommendation misses a critical point, which is that increasing sample size will not remedy psychology's lack of strong measurement, lack of strong theories and models, and lack of effective experimental control over error variance. In contrast, there is a long history of research in psychology employing small-N designs that treats the individual participant as the replication unit, which addresses each of these failings, and which produces results that are robust and readily replicated. We illustrate the properties of small-N and large-N designs using a simulated paradigm investigating the stage structure of response times. Our simulations highlight the high power and inferential validity of the small-N design, in contrast to the lower power and inferential indeterminacy of the large-N design. We argue that, if psychology is to be a mature quantitative science, then its primary theoretical aim should be to investigate systematic, functional relationships as they are manifested at the individual participant level and that, wherever possible, it should use methods that are optimized to identify relationships of this kind.
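The abstract's point about power at the individual level can be illustrated with a toy simulation. The sketch below is not the authors' simulation of response-time stage structure; it is a minimal example under assumed values (Gaussian response times, a 500 ms baseline, a 30 ms true effect, a 100 ms trial-to-trial standard deviation, and an approximate two-sample z-test), all of which are hypothetical choices made for illustration. It estimates how often a single participant's data reveal the effect as the number of trials per condition grows.

```python
import random
import statistics


def participant_detects_effect(n_trials, effect_ms, trial_sd_ms, rng):
    """Simulate one participant in two conditions and return True if an
    approximate two-sample z-test (|z| > 1.96) flags the difference."""
    # Hypothetical 500 ms baseline with Gaussian trial noise (an assumption).
    base = [rng.gauss(500.0, trial_sd_ms) for _ in range(n_trials)]
    slow = [rng.gauss(500.0 + effect_ms, trial_sd_ms) for _ in range(n_trials)]
    diff = statistics.fmean(slow) - statistics.fmean(base)
    # Standard error of the difference between two independent means.
    se = ((statistics.variance(base) + statistics.variance(slow)) / n_trials) ** 0.5
    return abs(diff / se) > 1.96


def detection_rate(n_trials, effect_ms=30.0, trial_sd_ms=100.0, n_sims=500, seed=1):
    """Fraction of simulated participants whose own data show the effect."""
    rng = random.Random(seed)
    hits = sum(
        participant_detects_effect(n_trials, effect_ms, trial_sd_ms, rng)
        for _ in range(n_sims)
    )
    return hits / n_sims


if __name__ == "__main__":
    for n in (20, 100, 400):
        print(f"{n} trials per condition: detection rate {detection_rate(n):.2f}")
```

Under these assumed values, the detection rate for an individual participant rises steeply with trials per condition, which is the sense in which a design with few participants and many trials each can treat every participant as a replication. The sketch does not model between-participant variability or the large-N comparison discussed in the paper.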

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3542/6267527/f2d4f2202974/13423_2018_1451_Fig1_HTML.jpg

Similar articles

Is psychology suffering from a replication crisis? What does "failure to replicate" really mean?

Am Psychol. 2015 Sep;70(6):487-98. doi: 10.1037/a0039400.

Macromolecular crowding: chemistry and physics meet biology (Ascona, Switzerland, 10-14 June 2012).

Phys Biol. 2013 Aug;10(4):040301. doi: 10.1088/1478-3975/10/4/040301. Epub 2013 Aug 2.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Bayesian inference of population prevalence.

Elife. 2021 Oct 6;10:e62461. doi: 10.7554/eLife.62461.

The generalizability crisis.

Behav Brain Sci. 2020 Dec 21;45:e1. doi: 10.1017/S0140525X20001685.

Historically recontextualizing Sidman's Tactics: How behavior analysis avoided psychology's methodological Ouroboros.

J Exp Anal Behav. 2021 Jan;115(1):115-128. doi: 10.1002/jeab.661. Epub 2020 Dec 17.

A Bayesian Perspective on the Reproducibility Project: Psychology.

PLoS One. 2016 Feb 26;11(2):e0149794. doi: 10.1371/journal.pone.0149794. eCollection 2016.

Comparing the accuracy of experimental estimates to guessing: a new perspective on replication and the "Crisis of Confidence" in psychology.

Behav Res Methods. 2014 Mar;46(1):1-14. doi: 10.3758/s13428-013-0342-1.

Rating scales institutionalise a network of logical errors and conceptual problems in research practices: A rigorous analysis showing ways to tackle psychology's crises.

Front Psychol. 2022 Dec 28;13:1009893. doi: 10.3389/fpsyg.2022.1009893. eCollection 2022.

Cited by

Balance of power: The choice between trial and participant numbers to optimise the detection of phase-dependent effects.

Imaging Neurosci (Camb). 2024 Nov 5;2. doi: 10.1162/imag_a_00345. eCollection 2024.

High-resolution 7T fMRI reveals the visual zone of the human claustrum.

Imaging Neurosci (Camb). 2024 Oct 24;2. doi: 10.1162/imag_a_00327. eCollection 2024.

Urgency overpowers cognitive control by amplifying cognitive processing asymmetries.

Atten Percept Psychophys. 2025 Aug;87(6):1974-1993. doi: 10.3758/s13414-025-03102-w. Epub 2025 Jul 13.

How strong is the rhythm of perception? A registered replication of Hickok et al. (2015).

R Soc Open Sci. 2025 Jun 11;12(6):220497. doi: 10.1098/rsos.220497. eCollection 2025 Jun.

Urgency enforces stimulus-driven action across spatial and numerical cognitive control tasks.

PLoS One. 2025 May 2;20(5):e0322482. doi: 10.1371/journal.pone.0322482. eCollection 2025.

The Effects of Visual Behavior and Ego-Movement on Foveated Rendering Performance in Virtual Reality.

SN Comput Sci. 2025;6(4):386. doi: 10.1007/s42979-025-03885-7. Epub 2025 Apr 15.

Detection of motor-related mu rhythm desynchronization by ear EEG.

PLoS One. 2025 Apr 8;20(4):e0321107. doi: 10.1371/journal.pone.0321107. eCollection 2025.

Sample size matters when estimating test-retest reliability of behaviour.

Behav Res Methods. 2025 Mar 21;57(4):123. doi: 10.3758/s13428-025-02599-1.

Spatiotemporal survival analysis for movement trajectory tracking in virtual reality.

Sci Rep. 2025 Mar 1;15(1):7313. doi: 10.1038/s41598-025-91471-5.

Attention Rhythmically Shapes Sensory Tuning.

J Neurosci. 2025 Feb 12;45(7):e1616242024. doi: 10.1523/JNEUROSCI.1616-24.2024.
