How are missing data in covariates handled in observational time-to-event studies in oncology? A systematic review.

Affiliations

Department of Medical Statistics, London School of Hygiene and Tropical Medicine, Keppel Street, London, UK.

MRC Clinical Trials Unit at UCL, 90 High Holborn, London, UK.

Publication information

BMC Med Res Methodol. 2020 May 29;20(1):134. doi: 10.1186/s12874-020-01018-7.

Abstract

BACKGROUND

Missing data in covariates can result in biased estimates and loss of power to detect associations. They can also complicate other aspects of time-to-event analyses, including the handling of time-varying effects of covariates, the selection of covariates, and their flexible modelling. This review aims to describe how researchers approach time-to-event analyses with missing data.

METHODS

Medline and Embase were searched for observational time-to-event studies in oncology published from January 2012 to January 2018. The review focused on proportional hazards models or extended Cox models. We investigated the extent and reporting of missing data and how they were addressed in the analysis. We also investigated covariate modelling and selection and the assessment of the proportional hazards assumption, along with how missing data were treated in these procedures.

RESULTS

In total, 148 studies were included. The mean proportion of individuals with missingness in any covariate was 32%. Complete-case analysis was used in 53% of studies and multiple imputation in 22%. Only 14% of studies stated an assumption concerning missing data, and only 34% stated missingness as a limitation. The proportional hazards assumption was checked in 28% of studies, of which 17% did not state the assessment method. The covariate selection procedure was stated for 58% of 144 multivariable models; the most common procedure was use of a pre-selected set of covariates, followed by stepwise methods and univariable analyses. Of 69 studies that included continuous covariates, 81% did not assess the appropriateness of the functional form.
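As an illustration of two quantities mentioned above, the following minimal Python sketch computes the proportion of individuals with missingness in any covariate and the sample that a complete-case analysis would retain. The records, covariate names, and helper functions are hypothetical and are not taken from the review or from any included study.

```python
# Hypothetical toy data: None marks a missing covariate value.
COVARIATES = ("age", "stage", "grade")

records = [
    {"age": 61, "stage": "II", "grade": 2},
    {"age": None, "stage": "III", "grade": 3},
    {"age": 55, "stage": None, "grade": None},
    {"age": 70, "stage": "I", "grade": 1},
    {"age": 48, "stage": "II", "grade": None},
]


def has_missing(record, covariates=COVARIATES):
    """Return True if any listed covariate is missing for this individual."""
    return any(record[name] is None for name in covariates)


def proportion_with_any_missing(rows, covariates=COVARIATES):
    """Share of individuals with at least one missing covariate."""
    return sum(has_missing(row, covariates) for row in rows) / len(rows)


def complete_cases(rows, covariates=COVARIATES):
    """Keep only individuals observed on every listed covariate."""
    return [row for row in rows if not has_missing(row, covariates)]


if __name__ == "__main__":
    share = proportion_with_any_missing(records)
    kept = complete_cases(records)
    print(f"Missing in any covariate: {share:.0%}")
    print(f"Complete cases retained: {len(kept)} of {len(records)}")
```

In this toy example, three of the five individuals have at least one missing covariate, so a complete-case analysis would fit the model on only two records. Whether dropping those records biases the estimates depends on the missingness mechanism, which the sketch does not address.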

CONCLUSION

While guidelines for handling missing data in epidemiological studies are in place, this review indicates that few studies report implementing the recommendations in practice. Although missing data are present in many studies, we found that few state clearly how the missing data were handled or which assumptions were made. Easy-to-implement but potentially biased approaches such as complete-case analysis are the most commonly used, even though they rely on strong assumptions and more appropriate methods are often available. Authors should be encouraged to follow existing guidelines to address missing data, and higher expectations from journals and editors could help to improve practice.

