Evaluating the median p-value method for assessing the statistical significance of tests when using multiple imputation.

Authors

Austin Peter C, Eekhout Iris, van Buuren Stef

Affiliations

ICES, Toronto, Canada.

Institute of Health Policy, Management and Evaluation, University of Toronto, Canada.

Publication information

J Appl Stat. 2024 Oct 25;52(6):1161-1176. doi: 10.1080/02664763.2024.2418473. eCollection 2025.

DOI:10.1080/02664763.2024.2418473

PMID:40303568

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12035927/

Abstract

Rubin's Rules are commonly used to pool the results of statistical analyses across imputed samples when using multiple imputation. Rubin's Rules cannot be used when the result of an analysis in an imputed dataset is not a statistic and its associated standard error, but a test statistic (e.g. Student's t-test). While complex methods have been proposed for pooling test statistics across imputed samples, these methods have not been implemented in many popular statistical software packages. The median p-value method has been proposed for pooling test statistics. The statistical significance level of the pooled test statistic is the median of the associated p-values across the imputed samples. We evaluated the performance of this method with nine statistical tests: Student's t-test, Wilcoxon Rank Sum test, Analysis of Variance, Kruskal-Wallis test, the test of significance for Pearson's and Spearman's correlation coefficient, the Chi-squared test, the test of significance for a regression coefficient from a linear regression and from a logistic regression. For each test, the empirical type I error rate was higher than the advertised rate. The magnitude of inflation increased as the prevalence of missing data increased. The median p-value method should not be used to assess statistical significance across imputed datasets.
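To make the pooling rule concrete, the following is a minimal sketch of the median p-value method as the abstract describes it: run the same test in each imputed dataset, collect the p-values, and take their median. The function name `median_p_value` and the five example p-values are hypothetical and are not taken from the paper. The sketch only illustrates the procedure under evaluation; the abstract reports that this procedure inflates the type I error rate.

```python
from statistics import median


def median_p_value(p_values):
    """Pool per-imputation p-values by taking their median.

    Hypothetical helper for illustration; not code from the paper.
    """
    p_values = list(p_values)
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    return median(p_values)


# Made-up p-values from the same test applied to five imputed datasets.
example_p_values = [0.03, 0.08, 0.04, 0.06, 0.02]

pooled = median_p_value(example_p_values)
print(pooled)         # 0.04
print(pooled < 0.05)  # True: the method would declare significance at the 5% level
```

With an even number of imputed datasets, `statistics.median` returns the mean of the two middle p-values.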

