Control of data quality for population-based cancer survival analysis.

Author information

Li Ruoran, Abela Louise, Moore Jonathan, Woods Laura M, Nur Ula, Rachet Bernard, Allemani Claudia, Coleman Michel P

Affiliations

Cancer Research UK Cancer Survival Group, London School of Hygiene and Tropical Medicine, London, UK.

Publication information

Cancer Epidemiol. 2014 Jun;38(3):314-20. doi: 10.1016/j.canep.2014.02.013. Epub 2014 Mar 29.

Abstract

BACKGROUND

Population-based cancer survival is an important measure of the overall effectiveness of cancer care in a population. Population-based cancer registries collect data that enable the estimation of cancer survival. To ensure accurate, consistent and comparable survival estimates, strict control of data quality is required before the survival analyses are carried out. In this paper, we present a basis for data quality control for cancer survival.

METHODS

We propose three distinct phases for the quality control. Firstly, each individual variable within a given record is examined to identify departures from the study protocol; secondly, each record is checked and excluded if it is ineligible or logically incoherent for analysis; lastly, the distributions of key characteristics in the whole dataset are examined for their plausibility.
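The three phases can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the field names (`sex`, `status`, `birth`, `diagnosis`, `last_seen`), the allowed codes, and the specific checks are all hypothetical, and only the 1991 to 2010 diagnosis window is borrowed from the bladder cancer example in the Results.

```python
from collections import Counter
from datetime import date

# Hypothetical protocol: allowed codes and the eligible diagnosis window.
ALLOWED_SEX = {"M", "F"}
ALLOWED_STATUS = {"alive", "dead"}
FIRST_YEAR, LAST_YEAR = 1991, 2010


def variable_errors(rec):
    """Phase 1: flag individual variables that depart from the protocol."""
    errors = []
    if rec.get("sex") not in ALLOWED_SEX:
        errors.append("sex")
    if rec.get("status") not in ALLOWED_STATUS:
        errors.append("status")
    for field in ("birth", "diagnosis", "last_seen"):
        if not isinstance(rec.get(field), date):
            errors.append(field)
    return errors


def record_exclusion(rec):
    """Phase 2: return a reason to exclude the record, or None to keep it."""
    if not FIRST_YEAR <= rec["diagnosis"].year <= LAST_YEAR:
        return "ineligible: diagnosis outside study period"
    if not rec["birth"] <= rec["diagnosis"] <= rec["last_seen"]:
        return "incoherent: dates out of sequence"
    return None


def distributions(records):
    """Phase 3: tabulate key characteristics for a plausibility review."""
    return {
        "by_year": Counter(r["diagnosis"].year for r in records),
        "by_sex": Counter(r["sex"] for r in records),
    }


def quality_control(records):
    """Run the three phases in order and report what each one found."""
    report = {"variable_errors": 0, "excluded": Counter()}
    kept = []
    for rec in records:
        if variable_errors(rec):
            report["variable_errors"] += 1
            continue
        reason = record_exclusion(rec)
        if reason:
            report["excluded"][reason] += 1
            continue
        kept.append(rec)
    report["distributions"] = distributions(kept)
    return kept, report
```

The sketch returns the counts from each phase alongside the retained records, so the quality control results can be reported together with the survival estimates. In this sketch a record that fails a phase 1 check is simply set aside; whether such records are corrected, queried with the registry, or excluded is a protocol decision the abstract does not specify.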

RESULTS

Data for patients diagnosed with bladder cancer in England between 1991 and 2010 are used as an example to aid the interpretation of the differences in data quality. The effect of different aspects of data quality on survival estimates is discussed.

CONCLUSIONS

We recommend that the results of data quality procedures should be reported together with the findings from survival analysis, to facilitate their interpretation.
