South Western Sydney Clinical School, Faculty of Medicine, UNSW, Sydney, New South Wales, Australia.
Ingham Institute for Applied Medical Research, Liverpool, New South Wales, Australia.
J Med Imaging Radiat Oncol. 2021 Aug;65(5):627-636. doi: 10.1111/1754-9485.13287. Epub 2021 Jul 31.
There is significant potential to analyse and model routinely collected data for radiotherapy patients to provide evidence to support clinical decisions, particularly where clinical trial evidence is limited or non-existent. In practice, however, administrative, ethical, technical, logistical and legislative barriers stand in the way of coordinated data analysis platforms across radiation oncology centres.
A distributed learning network of computer systems is presented, with software tools to extract and report on oncology data and to enable statistical model development. A distributed or federated learning approach keeps data in the local centre, but models are developed from the entire cohort.
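The idea that data stay in the local centre while the model is fitted on the whole cohort can be illustrated with a minimal sketch. This is not the software described here: it assumes a plain logistic regression trained by gradient descent, and the names `local_gradient`, `federated_fit` and `make_synthetic_centres`, as well as the synthetic data, are hypothetical. Each centre returns only a summed gradient and a row count, never patient-level records.

```python
import numpy as np


def local_gradient(weights, X, y):
    """Runs inside one centre; only the summed gradient and row count leave the site."""
    p = 1.0 / (1.0 + np.exp(-X @ weights))  # predicted probabilities
    return X.T @ (p - y), len(y)


def federated_fit(centres, n_features, lr=0.1, rounds=200):
    """Coordinator loop: combine per-centre gradients into one global update."""
    w = np.zeros(n_features)
    for _ in range(rounds):
        results = [local_gradient(w, X, y) for X, y in centres]
        total_grad = sum(g for g, _ in results)
        total_n = sum(n for _, n in results)
        w -= lr * total_grad / total_n  # same step as pooled gradient descent
    return w


def make_synthetic_centres(n_centres=6, seed=0):
    """Hypothetical data: an intercept plus two covariates per patient."""
    rng = np.random.default_rng(seed)
    true_w = np.array([0.5, -1.0, 2.0])
    centres = []
    for _ in range(n_centres):
        n = int(rng.integers(50, 150))
        X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
        p = 1.0 / (1.0 + np.exp(-X @ true_w))
        y = (rng.random(n) < p).astype(float)
        centres.append((X, y))
    return centres


if __name__ == "__main__":
    centres = make_synthetic_centres()
    print(federated_fit(centres, n_features=3))
```

Because the pooled gradient is the sum of the per-centre gradients, this sketch yields the same coefficients as fitting on the combined data, up to floating-point rounding. A real deployment would also need secure transport, governance and possibly privacy safeguards on the shared aggregates.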
The feasibility of this approach is demonstrated across six Australian oncology centres, using routinely collected lung cancer data from oncology information systems. The infrastructure was used to develop and validate machine learning models for clinical decision support and, at one centre, to assess patient eligibility for two major lung cancer radiotherapy clinical trials (RTOG-9410, RTOG-0617). External validation of a 2-year overall survival model for non-small cell lung cancer (NSCLC) gave an AUC of 0.65 and a C-index of 0.62 across the network. At one centre, 65% of Stage III NSCLC patients did not meet eligibility criteria for either of the two practice-changing clinical trials, and these patients had poorer survival than eligible patients (10.6 months vs. 15.8 months, P = 0.024).
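For readers less familiar with the two discrimination metrics quoted above, the following sketch shows one common way to compute them. It is illustrative only: the exact procedure used in the study is not stated here, the function names `auc` and `c_index` are hypothetical, and the C-index follows the usual Harrell-style pairwise definition with right censoring. A value of 0.5 corresponds to chance and 1.0 to perfect ranking.

```python
def auc(scores, labels):
    """Probability that a random positive case scores higher than a random negative one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def c_index(risk, time, event):
    """Fraction of comparable patient pairs in which the higher predicted risk
    belongs to the patient with the shorter observed survival."""
    concordant = 0.0
    comparable = 0
    for i in range(len(risk)):
        for j in range(len(risk)):
            # A pair is comparable when patient i had the event before time[j].
            if event[i] and time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


if __name__ == "__main__":
    # Toy inputs: label 1 means death within 2 years, scores are predicted risks.
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
    # Toy survival data: times in months, event 1 = death observed, 0 = censored.
    print(c_index([0.9, 0.5, 0.2], [5, 10, 20], [1, 1, 0]))
```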
Population-based studies on routine data are possible using a distributed learning approach. Such an approach has the potential to provide decision support models for patients to whom the available clinical trial evidence does not apply.