Inference following multiple imputation for generalized additive models: an investigation of the median p-value rule with applications to the Pulmonary Hypertension Association Registry and Colorado COVID-19 hospitalization data.

Author information

Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado-Denver Anschutz Medical Campus, 13001 E. 17th Pl, Aurora, CO, USA.

School of Medicine, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO, USA.

Publication information

BMC Med Res Methodol. 2022 May 21;22(1):148. doi: 10.1186/s12874-022-01613-w.

DOI:10.1186/s12874-022-01613-w

PMID:35597908

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9123297/

Abstract

BACKGROUND

Missing data prove troublesome in data analysis; at best they reduce a study's statistical power and at worst they induce bias in parameter estimates. Multiple imputation via chained equations is a popular technique for dealing with missing data. However, techniques for combining and pooling results from fitted generalized additive models (GAMs) after multiple imputation have not been well explored.

METHODS

We simulated missing data under missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR) mechanisms and used random forest and predictive mean matching imputation to investigate a variety of rules for combining GAMs after multiple imputation with binary and normally distributed outcomes. We compared multiple pooling procedures, including the "D2" method, the Cauchy combination test, and the median p-value (MPV) rule. The MPV rule simply computes and reports the median p-value across all imputations. Other ad hoc methods, such as a mean p-value rule and a single imputation method, were also investigated. The viability of these methods for pooling results from B-splines was also examined for normal outcomes. These pooling techniques were then applied to two case studies: one examining the effect of elevation on six-minute walk distance (a normal outcome) in patients with pulmonary arterial hypertension, and the other examining risk factors for intubation in hospitalized COVID-19 patients (a dichotomous outcome).
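The p-value pooling rules named above can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names and the example p-values are hypothetical, and the Cauchy combination uses equal weights by default as an assumption. Each function takes one p-value per imputed dataset for the same model term.

```python
import math
from statistics import mean, median


def median_p_value(p_values):
    """MPV rule: report the median of the per-imputation p-values."""
    return median(p_values)


def mean_p_value(p_values):
    """Ad hoc alternative: report the mean of the per-imputation p-values."""
    return mean(p_values)


def cauchy_combination(p_values, weights=None):
    """Cauchy combination test with non-negative weights summing to 1.

    Each p-value is mapped to a standard Cauchy variate via
    tan((0.5 - p) * pi); the weighted sum is converted back to a
    p-value with the standard Cauchy survival function.
    """
    m = len(p_values)
    if weights is None:
        weights = [1.0 / m] * m  # assumption: equal weights
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, p_values))
    return 0.5 - math.atan(t) / math.pi


# Hypothetical p-values for one smooth term, one per imputed dataset.
p_by_imputation = [0.031, 0.048, 0.022, 0.067, 0.040]

pooled = {
    "median": median_p_value(p_by_imputation),
    "mean": mean_p_value(p_by_imputation),
    "cauchy": cauchy_combination(p_by_imputation),
}
print(pooled)
```

In practice, the per-imputation p-values would come from fitting the same GAM to each imputed dataset and extracting the test for the term of interest; that fitting step is omitted here.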

RESULTS

Compared with results from generalized additive models fit to the full datasets, the median p-value rule performs as well as, if not better than, the other methods examined. When the alternative hypothesis is true, the Cauchy combination test appears overpowered and the other methods appear underpowered, while the median p-value rule yields results similar to those from analyses of complete data.
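A toy calculation shows one way the Cauchy combination test and the median p-value rule can diverge. This is only an illustration under made-up inputs, not a reproduction of the simulation results: the p-values below are hypothetical, the weights are assumed equal, and the function name is ours. A single very small p-value dominates the Cauchy statistic, whereas the median is unaffected by it.

```python
import math
from statistics import median


def cauchy_combination(p_values):
    """Equal-weight Cauchy combination of p-values (weights are an assumption)."""
    m = len(p_values)
    t = sum(math.tan((0.5 - p) * math.pi) for p in p_values) / m
    return 0.5 - math.atan(t) / math.pi


# Hypothetical per-imputation p-values: one extreme value, four unremarkable ones.
p_by_imputation = [0.001, 0.40, 0.50, 0.60, 0.70]

mpv = median(p_by_imputation)                 # stays at the middle value, 0.50
cct = cauchy_combination(p_by_imputation)     # pulled to roughly 0.005

print(f"median p-value: {mpv:.3f}")
print(f"Cauchy combination p-value: {cct:.4f}")
```

With these inputs the Cauchy combination would reject at the 0.05 level while the median p-value rule would not, which is the kind of sensitivity to a few small p-values that a tendency toward rejection would be consistent with.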

CONCLUSIONS

For pooling results after fitting GAMs to multiply imputed datasets, the median p-value rule is a simple yet useful approach that balances power to detect important associations with control of Type I errors.
