Department of Oncology, University of Oxford, Oxford, UK.
Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK.
Lancet Oncol. 2024 Nov;25(11):1476-1486. doi: 10.1016/S1470-2045(24)00497-2. Epub 2024 Oct 9.
Cancer places a high burden on society and health-care systems. Cancer research requires high-quality data, which are resource-intensive to obtain. Using administrative datasets such as cancer registries could improve the efficiency of cancer studies, provided the data are valid and timely. We aimed to compare the validity and timeliness of diagnostic cancer data collected on-site during the SYMPLIFY study with those of data obtained from the cancer registries of England and Wales.
Cancer data were collected from 5461 participants across 44 hospital sites during a prospective observational study in England and Wales, SYMPLIFY (ISRCTN10226380). Linked cancer data were obtained from Digital Health and Care Wales (DHCW), the Welsh Cancer Intelligence and Surveillance Unit (WCISU), and the English National Cancer Registration Dataset (NCRD) and Rapid Cancer Registration Dataset (RCRD), regularly between April, 2022, and September, 2023. The primary objectives of the study were to evaluate the validity (via assessment of the proportion of completed data fields and concordance with SYMPLIFY sites), and timeliness of the data in all datasets, for all cancers diagnosed within 9 months of study enrolment. Data fields investigated were cancer site via International Classification of Disease, 10th Revision (ICD-10) code; cancer morphology via International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3) morphology histology code and broad morphological grouping; overall stage; and TNM classification.
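The two validity measures described above (proportion of completed data fields, and concordance with site-collected values) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy ICD-10 codes, the choice to measure concordance only on records where both sources have a value, and the use of a Wilson score interval for the 95% CI. The abstract does not state which interval method or denominator the authors used.

```python
from math import sqrt


def completeness(values):
    """Proportion of records in which the field is populated (not None)."""
    return sum(v is not None for v in values) / len(values)


def concordance(site_values, registry_values):
    """Count agreements among records where BOTH sources have a value.

    Restricting to jointly populated records is an assumption of this
    sketch, not a rule taken from the study.
    Returns (number agreeing, number compared).
    """
    pairs = [
        (s, r)
        for s, r in zip(site_values, registry_values)
        if s is not None and r is not None
    ]
    agree = sum(s == r for s, r in pairs)
    return agree, len(pairs)


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed CI method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical toy records (ICD-10 site codes), not study data.
site = ["C34", "C50", "C18", "C61", None]
registry = ["C34", "C50", "C19", None, "C25"]

agree, n = concordance(site, registry)
low, high = wilson_ci(agree, n)
print(f"registry completeness: {completeness(registry):.0%}")
print(f"concordance: {agree}/{n}, 95% CI {low:.0%}-{high:.0%}")
```

With so few records the interval is very wide, which is one reason the narrower intervals reported for ICD-10 than for TNM partly reflect how many records could be compared for each field.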
For data collected between April, 2022, and September, 2023, completeness at the last data cut available for each dataset ranged from 84% to 100% for ICD-O-3 morphology, from 43% to 100% for overall stage, and from 74% to 83% for TNM stage. The concordance between SYMPLIFY data and NCRD was 96% (95% CI 92-98) for ICD-10, 60% (53-66) for ICD-O-3 morphology, 83% (78-88) for ICD-O-3 broad morphology groupings, 73% (67-78) for stage, and 51% (44-59) for TNM; and with WCISU was 89% (95% CI 81-94) for ICD-10, 63% (53-73) for ICD-O-3 morphology, 80% (70-87) for ICD-O-3 broad morphology groupings, 83% (74-90) for overall stage, and 49% (38-61) for TNM stage. Concordance between SYMPLIFY and RCRD was 95% (95% CI 92-98) for ICD-10, 67% (60-74) for ICD-O-3 morphology, 85% (79-90) for ICD-O-3 broad morphology groupings, and 73% (65-80) for overall stage; and between SYMPLIFY and DHCW was 96% (91-99) for ICD-10, 74% (64-83) for ICD-O-3 morphology, 84% (75-91) for ICD-O-3 broad morphology groupings, and 87% (74-95) for stage. The SYMPLIFY dataset reached completion at 12 months post-enrolment in November, 2022, compared with 13 months for NCRD in December, 2023. RCRD and DHCW reached completion at 13 months and 15 months post-enrolment, in December, 2022, and February, 2023, respectively.
We report similar completeness of data fields, concordance, and timeliness between on-site and centrally collected cancer outcomes data. Our findings suggest that central registry data can help alleviate the resource burden in clinical trials and improve cancer research. Cancer registries might need additional resources to provide data for registry-based trials at scale.
GRAIL Bio UK.