What impact do assumptions about missing data have on conclusions? A practical sensitivity analysis for a cancer survival registry.

Author information

Smuk M, Carpenter J R, Morris T P

Affiliations

Centre for Psychiatry, Queen Mary University of London, Charterhouse Square, London, EC1M 6BQ, UK.

Medical Statistics Department, London School of Hygiene and Tropical Medicine, London, UK.

Publication information

BMC Med Res Methodol. 2017 Feb 6;17(1):21. doi: 10.1186/s12874-017-0301-0.

Abstract

BACKGROUND

Within epidemiological and clinical research, missing data are a common issue that is often overlooked in publications. When missing observations are addressed, it is usually assumed that the data are 'missing at random' (MAR). This assumption should be checked for plausibility; however, it is untestable, so inferences should be assessed for robustness to departures from MAR.

METHODS

We highlight the method of pattern mixture sensitivity analysis after multiple imputation, using colorectal cancer data as an example. We focus on the Dukes' stage variable, which has the highest proportion of missing observations. First, we find the probability of being in each Dukes' stage given the MAR imputed dataset. We present these probabilities in a questionnaire to elicit experts' prior beliefs about what the corresponding probabilities would be among the missing data. The questionnaire responses are then used in a Dirichlet draw to create a Bayesian 'missing not at random' (MNAR) prior, which is used to impute the missing observations. The model of interest is then applied, and inferences are compared with those from the MAR imputed data.
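The Dirichlet step can be illustrated with a minimal sketch. It is not the authors' implementation: the stage coding (0 to 3 for Dukes' A to D, with -1 for missing), the elicited probabilities, the `prior_weight` concentration and the function name `impute_stage_mnar` are all hypothetical, and the sketch ignores covariates, which a full pattern mixture analysis would normally condition on.

```python
import numpy as np

def impute_stage_mnar(stage, elicited_probs, prior_weight, rng):
    """Fill missing stage codes (-1) using one Dirichlet draw.

    elicited_probs: hypothetical expert beliefs about the stage
    distribution among patients whose stage is missing.
    prior_weight: hypothetical concentration; larger values keep the
    drawn probabilities closer to the elicited ones.
    """
    stage = np.asarray(stage).copy()
    missing = stage < 0
    # Dirichlet parameters proportional to the elicited probabilities.
    alpha = prior_weight * np.asarray(elicited_probs, dtype=float)
    probs = rng.dirichlet(alpha)
    # Impute each missing stage from the drawn category probabilities.
    stage[missing] = rng.choice(len(probs), size=int(missing.sum()), p=probs)
    return stage, probs

rng = np.random.default_rng(2017)
observed = np.array([0, 1, -1, 2, 3, -1, 1, -1, 2, 0])  # made-up data
elicited = [0.10, 0.25, 0.30, 0.35]                     # made-up beliefs

# One fresh Dirichlet draw per imputed dataset, so that uncertainty in
# the elicited prior carries through to the pooled inferences.
imputed_sets = [impute_stage_mnar(observed, elicited, 50.0, rng)
                for _ in range(5)]
```

Each imputed dataset would then be analysed with the model of interest and the results pooled in the usual multiple imputation way, for comparison with the MAR-based inferences.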

RESULTS

The inferences were largely insensitive to departure from MAR. Inferences under MNAR suggested a smaller association between Dukes' stage and death, though the association remained positive and with similarly low p values.

CONCLUSIONS

We conclude by discussing the positives and negatives of our method and highlight the importance of making people aware of the need to test the MAR assumption.
