Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse.

Authors

Forstmeier Wolfgang, Schielzeth Holger

Publication information

Behav Ecol Sociobiol. 2011 Jan;65(1):47-55. doi: 10.1007/s00265-010-1038-5. Epub 2010 Aug 19.

DOI:10.1007/s00265-010-1038-5

PMID:21297852

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3015194/

Abstract

Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one 'significant' effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies ('the winner's curse'). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.
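The 40% figure quoted in the abstract can be checked with a few lines of arithmetic, and the winner's curse can be illustrated with a small simulation. The sketch below is not the authors' simulation: it assumes the tests are independent, and it reduces the winner's curse to a single effect estimated with known unit variance and filtered by a two-sided z-test. The function names and the parameters `true_effect`, `n` and `n_studies` are hypothetical and chosen only for illustration.

```python
import math
import random
from itertools import combinations


def n_terms(n_predictors: int) -> int:
    """Count main effects plus all two-way interactions."""
    n_interactions = len(list(combinations(range(n_predictors), 2)))
    return n_predictors + n_interactions


def familywise_error(k: int, alpha: float = 0.05) -> float:
    """P(at least one of k true-null tests gives p < alpha).

    Assumes the k tests are independent (a simplification).
    """
    return 1.0 - (1.0 - alpha) ** k


def winners_curse(true_effect: float = 0.2, n: int = 50,
                  n_studies: int = 20000, seed: int = 1):
    """Compare the mean estimate over all studies with the mean over
    'significant' studies only.

    Each simulated study estimates a mean from n observations with known
    unit variance, so the estimate is drawn from N(true_effect, 1/n).
    Significance is a two-sided z-test at alpha = 0.05.
    """
    rng = random.Random(seed)
    se = 1.0 / math.sqrt(n)
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    all_est, sig_est = [], []
    for _ in range(n_studies):
        est = rng.gauss(true_effect, se)
        all_est.append(est)
        if abs(est) / se > z_crit:
            sig_est.append(est)
    mean_all = sum(all_est) / len(all_est)
    mean_sig = sum(sig_est) / len(sig_est)
    return mean_all, mean_sig


if __name__ == "__main__":
    k = n_terms(4)  # 4 main effects + 6 two-way interactions
    print(f"terms tested: {k}")
    print(f"P(at least one false positive): {familywise_error(k):.3f}")
    mean_all, mean_sig = winners_curse()
    print(f"mean estimate, all studies:        {mean_all:.3f}")
    print(f"mean estimate, significant only:   {mean_sig:.3f}")
```

With four predictors there are ten terms, and 1 - 0.95^10 is about 0.40, which matches the theoretical expectation cited in the abstract. In the simulation, the estimates that pass the significance filter average well above the true effect, which is the selection mechanism behind the winner's curse.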

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0c5/3015194/aa326aaf963d/265_2010_1038_Fig1_HTML.jpg
