Statistical analysis and handling of missing data in cluster randomized trials: a systematic review.

Author information

Fiero Mallorie H, Huang Shuang, Oren Eyal, Bell Melanie L

Affiliations

Department of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, 1295 N. Martin Ave., Drachman Hall, P.O. Box 245163, Tucson, Arizona, 85724, USA.

Publication information

Trials. 2016 Feb 9;17:72. doi: 10.1186/s13063-016-1201-z.

Abstract

BACKGROUND

Cluster randomized trials (CRTs) randomize participants in groups rather than as individuals, and are key tools for assessing interventions in health research where treatment contamination is likely or individual randomization is not feasible. CRTs have two potential major pitfalls: inadequate handling of missing data and failure to account for clustering in the primary analysis. The aim of this review was to evaluate the approaches used for handling missing data and for statistical analysis of the primary outcome in CRTs.

METHODS

We systematically searched for CRTs published between August 2013 and July 2014 using PubMed, Web of Science, and PsycINFO. For each trial, two independent reviewers assessed the extent of the missing data and method(s) used for handling missing data in the primary and sensitivity analyses. We evaluated the primary analysis and determined whether it was at the cluster or individual level.

RESULTS

Of the 86 included CRTs, 80 (93%) trials reported some missing outcome data. Of those reporting missing data, the median percent of individuals with a missing outcome was 19% (range 0.5 to 90%). The most common way to handle missing data in the primary analysis was complete case analysis (44, 55%), whereas 18 (22%) used mixed models, six (8%) used single imputation, four (5%) used unweighted generalized estimating equations, and two (2%) used multiple imputation. Fourteen (16%) trials reported a sensitivity analysis for missing data, but most assumed the same missing data mechanism as in the primary analysis. Overall, 67 (78%) trials accounted for clustering in the primary analysis.
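The percentages above rest on two different denominators, which a quick arithmetic check makes explicit. The sketch below is illustrative only: it assumes that the handling-method percentages are taken out of the 80 trials reporting missing data, and that the sensitivity-analysis and clustering percentages are taken out of all 86 trials. The abstract does not state these denominators; they are inferred because they reproduce the reported figures. The names `REPORTED`, `percent`, and `check` are hypothetical.

```python
# Counts and percentages as reported in the RESULTS paragraph.
TOTAL_TRIALS = 86          # all included CRTs
TRIALS_WITH_MISSING = 80   # assumed denominator for the handling methods

# label -> (count, assumed denominator, reported percentage)
REPORTED = {
    "reported missing outcome data": (80, TOTAL_TRIALS, 93),
    "complete case analysis": (44, TRIALS_WITH_MISSING, 55),
    "mixed models": (18, TRIALS_WITH_MISSING, 22),
    "single imputation": (6, TRIALS_WITH_MISSING, 8),
    "unweighted GEE": (4, TRIALS_WITH_MISSING, 5),
    "multiple imputation": (2, TRIALS_WITH_MISSING, 2),
    "sensitivity analysis reported": (14, TOTAL_TRIALS, 16),
    "accounted for clustering": (67, TOTAL_TRIALS, 78),
}


def percent(count, denominator):
    """Return count as a percentage of denominator."""
    return 100.0 * count / denominator


def check(reported, tolerance=1.0):
    """Map each label to True if the recomputed percentage lies within
    `tolerance` percentage points of the reported one."""
    return {
        label: abs(percent(count, denom) - pct) < tolerance
        for label, (count, denom, pct) in reported.items()
    }


if __name__ == "__main__":
    for label, (count, denom, pct) in REPORTED.items():
        print(f"{label}: {count}/{denom} = {percent(count, denom):.1f}% "
              f"(reported {pct}%)")
```

Under these assumed denominators every recomputed value falls within one percentage point of the reported figure. The five listed handling methods cover 74 of the 80 trials; the abstract does not say how the remaining six handled missing data.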

CONCLUSIONS

High rates of missing outcome data are present in the majority of CRTs, yet the handling of missing data in practice remains suboptimal. Researchers and applied statisticians should use missing data methods that are valid under plausible assumptions, in order to increase statistical power and reduce the possibility of bias. Sensitivity analyses should be performed under weakened assumptions about the missing data mechanism to explore the robustness of the results reported in the primary analysis.
