Validity of using multiple imputation for "unknown" stage at diagnosis in population-based cancer registry data.

Authors

Luo Qingwei, Egger Sam, Yu Xue Qin, Smith David P, O'Connell Dianne L

Affiliations

Cancer Research Division, Cancer Council NSW, Sydney, Australia.

Sydney School of Public Health, University of Sydney, Sydney, Australia.

Publication

PLoS One. 2017 Jun 27;12(6):e0180033. doi: 10.1371/journal.pone.0180033. eCollection 2017.

Abstract

BACKGROUND

The multiple imputation approach to missing data has been validated in a number of simulation studies that artificially induced missingness in fully observed stage data under a pre-specified missing data mechanism. However, the validity of multiple imputation has not yet been assessed using real data. The objective of this study was to assess the validity of using multiple imputation for "unknown" prostate cancer stage recorded in the New South Wales Cancer Registry (NSWCR) under real-world conditions.

METHODS

Data from the population-based cohort study NSW Prostate Cancer Care and Outcomes Study (PCOS) were linked to 2000-2002 NSWCR data. For cases with "unknown" NSWCR stage, PCOS-stage was extracted from clinical notes. Logistic regression was used to evaluate the missing at random assumption, adjusting for the variables from two imputation models: a basic model including NSWCR variables only, and an enhanced model including the same NSWCR variables together with PCOS primary treatment. Cox regression was used to evaluate the performance of multiple imputation (MI).

RESULTS

Of the 1864 prostate cancer cases, 32.7% were recorded as having "unknown" NSWCR stage. The missing at random assumption was satisfied when the logistic regression adjusted for the variables in the enhanced model, but not when it adjusted only for those in the basic model. The Cox models using data with imputed stage from either imputation model provided generally similar estimated hazard ratios, but with wider confidence intervals, compared with those derived from analysis of the data with PCOS-stage. However, the complete-case analysis provided considerably higher estimated hazard ratios for the low socio-economic status group and rural areas than those obtained from all other datasets.
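
The abstract does not say how the estimates from the imputed datasets were combined. A common choice is Rubin's rules, which add the between-imputation variance to the average within-imputation variance; this is one reason confidence intervals from imputed data tend to be wider. The following is a minimal sketch of that pooling step for a log hazard ratio, not the authors' analysis. The function name `pool_rubin`, the input values, and the use of a normal approximation in place of the usual t reference distribution are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, variance


def pool_rubin(estimates, variances, level=0.95):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: per-dataset point estimates (e.g. log hazard ratios)
    variances: per-dataset squared standard errors
    Returns (pooled estimate, total variance, (ci_low, ci_high)).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates and matching variances")

    q_bar = mean(estimates)       # pooled point estimate
    u_bar = mean(variances)       # average within-imputation variance
    b = variance(estimates)       # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b

    # Assumption: normal approximation; Rubin's rules normally use a
    # t distribution with adjusted degrees of freedom.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(total)
    return q_bar, total, (q_bar - half_width, q_bar + half_width)


if __name__ == "__main__":
    # Hypothetical log hazard ratios and variances, not taken from the study.
    log_hr, total_var, (lo, hi) = pool_rubin([0.4, 0.5, 0.6], [0.04, 0.04, 0.04])
    print("pooled HR:", math.exp(log_hr))
    print("95% CI:", math.exp(lo), math.exp(hi))
```

Pooling is done on the log scale and the result exponentiated, since log hazard ratios are closer to normally distributed than the hazard ratios themselves.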

CONCLUSIONS

Using MI to deal with "unknown" stage data recorded in a population-based cancer registry appears to provide valid estimates. We would recommend a cautious approach to the use of this method elsewhere.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4cc/5487067/98f0b5e72889/pone.0180033.g001.jpg