Evaluating the median p-value method for assessing the statistical significance of tests when using multiple imputation.

Authors

Austin Peter C, Eekhout Iris, van Buuren Stef

Affiliations

ICES, Toronto, Canada.

Institute of Health Policy, Management and Evaluation, University of Toronto, Canada.

Publication information

J Appl Stat. 2024 Oct 25;52(6):1161-1176. doi: 10.1080/02664763.2024.2418473. eCollection 2025.

DOI:10.1080/02664763.2024.2418473
PMID:40303568
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12035927/
Abstract

Rubin's Rules are commonly used to pool the results of statistical analyses across imputed samples when using multiple imputation. Rubin's Rules cannot be used when the result of an analysis in an imputed dataset is not a statistic and its associated standard error, but a test statistic (e.g. Student's t-test). While complex methods have been proposed for pooling test statistics across imputed samples, these methods have not been implemented in many popular statistical software packages. The median p-value method has been proposed for pooling test statistics. The statistical significance level of the pooled test statistic is the median of the associated p-values across the imputed samples. We evaluated the performance of this method with nine statistical tests: Student's t-test, Wilcoxon Rank Sum test, Analysis of Variance, Kruskal-Wallis test, the test of significance for Pearson's and Spearman's correlation coefficient, the Chi-squared test, the test of significance for a regression coefficient from a linear regression and from a logistic regression. For each test, the empirical type I error rate was higher than the advertised rate. The magnitude of inflation increased as the prevalence of missing data increased. The median p-value method should not be used to assess statistical significance across imputed datasets.
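
The pooling rule described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names `median_p_value` and `is_significant`, the significance level of 0.05, and the five example p-values are all hypothetical. It assumes the same test has already been run on each imputed dataset and that only the resulting p-values are available.

```python
from statistics import median


def median_p_value(p_values):
    """Pool per-imputation p-values by taking their median (hypothetical helper)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    return median(p_values)


def is_significant(p_values, alpha=0.05):
    """Declare significance when the pooled (median) p-value is below alpha."""
    return median_p_value(p_values) < alpha


# Hypothetical p-values from one test applied to m = 5 imputed datasets.
example_p_values = [0.031, 0.048, 0.062, 0.044, 0.057]

pooled = median_p_value(example_p_values)
print(pooled)                            # 0.048
print(is_significant(example_p_values))  # True
```

In this made-up example two of the five imputed datasets give p > 0.05, yet the pooled decision is "significant". The sketch only shows the mechanics of the rule; the abstract's finding is that this rule inflates the type I error rate, so it is not a recommendation to use it.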

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b696/12035927/e31cf940794d/CJAS_A_2418473_F0001_OC.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b696/12035927/890578ff68aa/CJAS_A_2418473_F0002_OC.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b696/12035927/a7ea3095702c/CJAS_A_2418473_F0003_OC.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b696/12035927/241fc402c2d7/CJAS_A_2418473_F0004_OC.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b696/12035927/e3ccfcd89713/CJAS_A_2418473_F0005_OC.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b696/12035927/fea5a1b011b7/CJAS_A_2418473_F0006_OC.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b696/12035927/5c6928a4fc4f/CJAS_A_2418473_F0007_OC.jpg
