[How to deal with missing data? Multiple imputation by chained equations: recommendations and explanations for clinical practice].

Author information

Legendre Bruno, Cerasuolo Damiano, Dejardin Olivier, Boyer Annabel

Affiliations

Centre hospitalier universitaire de Caen, service de néphrologie, dialyse et transplantation, avenue de la Délivrande, 14000 Caen, France

Inserm U1086 ANTICIPE, Caen, France

Publication information

Nephrol Ther. 2023 Jun 19;19(3):171-179. doi: 10.1684/ndt.2023.24.

Abstract

Missing data, a constant problem in medical research, has several consequences: a systematic loss of power, sometimes accompanied by reduced representativeness of the analyzed sample. There are three types of missing data: 1) missing completely at random (MCAR); 2) missing at random (MAR); 3) missing not at random (MNAR). Multiple imputation by chained equations handles missing data correctly under the MCAR and MAR assumptions. For each missing value j, it generates m simulated values that are plausible given the other variables. A random component is included in the simulation to reflect the uncertainty of the imputed values. Several data sets are thus created and analyzed individually, in an identical way. The estimates from each data set are then combined to obtain a single overall estimate. Multiple imputation increases power, corrects for some biases, and has the advantage of being applicable to many types of variables. Complete case analysis should no longer be the norm. The objective of this guide is to help the reader conduct an analysis with multiply imputed data. We cover the following points: the different types of missing data, the historical approaches to handling them, and, in detail, the multiple imputation method using chained equations. We provide a code example for the mice package of R®.
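
The combination step mentioned in the abstract is commonly carried out with Rubin's rules. The following is a minimal Python sketch of that pooling step only, not the authors' R code (in the mice package, the analogous step is performed by pool()). The function name pool_rubin and the numeric inputs are hypothetical, chosen purely for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-dataset estimates with Rubin's rules.

    estimates: one point estimate per imputed data set
    variances: the squared standard error of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates and matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total


# Hypothetical results from m = 3 imputed data sets (illustrative numbers only).
estimates = [1.0, 2.0, 3.0]
variances = [0.5, 0.5, 0.5]
pooled_estimate, total_variance = pool_rubin(estimates, variances)
pooled_se = total_variance ** 0.5
print(pooled_estimate, pooled_se)
```

The between-imputation term is what carries the uncertainty due to the missing values: when the m estimates disagree, the pooled standard error grows accordingly.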
