Jing Bowen, Wang Kai, Schmitz Erich, Tang Shanshan, Li Yunxiang, Zhang You, Wang Jing
Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, Texas, USA.
Advanced Imaging and Informatics for Radiation Therapy (AIRT) Lab, University of Texas Southwestern Medical Center, Dallas, Texas, USA.
Med Phys. 2024 Dec;51(12):9385-9393. doi: 10.1002/mp.17451. Epub 2024 Oct 6.
The I-SPY 2 trial is a nationwide, multi-institutional clinical trial designed to evaluate multiple new therapeutic drugs for high-risk breast cancer. Previous studies suggest that pathological complete response (pCR) is a viable indicator of long-term outcomes of neoadjuvant chemotherapy for high-risk breast cancer. While pCR can be assessed at surgery after chemotherapy, predicting pCR before chemotherapy is complete may facilitate personalized treatment management and improve outcomes. Notably, the acquisition of dynamic contrast-enhanced magnetic resonance (DCEMR) images at multiple time points during the I-SPY 2 trial opens up the possibility of early pCR prediction.
In this study, we investigated the feasibility of early prediction of pCR to neoadjuvant chemotherapy using multi-time point DCEMR images and clinical data acquired in the I-SPY 2 trial. We also quantified the prediction uncertainty so that physicians can make patient-specific treatment decisions based on the level of uncertainty associated with each prediction.
The dataset used in our study included 624 patients with DCEMR images acquired at 3 time points before the completion of chemotherapy: pretreatment (T0), after 3 cycles of treatment (T1), and after 12 cycles of treatment (T2). A convolutional long short-term memory (LSTM) network-based deep learning model, which integrated multi-time point deep image representations with clinical data, including tumor subtypes, was developed to predict pCR. Model performance was evaluated using nested 5-fold cross-validation. We also quantified the prediction uncertainty for each patient through test-time augmentation. To investigate the relationship between predictive performance and uncertainty, the area under the receiver operating characteristic curve (AUROC) was assessed on subgroups of patients stratified by uncertainty score.
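The uncertainty analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the uncertainty score was computed or how the subgroups were formed, so the sketch assumes the score is the standard deviation of predicted probabilities across augmented copies of an input, and that subgroups are roughly equal-sized bins ordered by that score. The names `predict_fn`, `augmentations`, `tta_predict`, `auroc`, and `auroc_by_uncertainty` are hypothetical.

```python
import numpy as np


def tta_predict(predict_fn, image, augmentations):
    """Test-time augmentation: run the model on several augmented copies.

    Returns (mean probability, standard deviation). Using the standard
    deviation as the uncertainty score is an assumption of this sketch.
    """
    probs = np.array([predict_fn(aug(image)) for aug in augmentations], dtype=float)
    return float(probs.mean()), float(probs.std())


def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U statistic), with average ranks for ties."""
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):  # tied scores share their average rank
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n_pos = int(labels.sum())
    n_neg = int((~labels).sum())
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUROC is undefined without both classes
    return float((ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))


def auroc_by_uncertainty(labels, probs, uncertainty, n_groups=8):
    """AUROC per subgroup, ordered from lowest to highest uncertainty.

    Equal-sized bins are an assumption; the abstract only reports eight groups.
    """
    labels = np.asarray(labels)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(np.asarray(uncertainty, dtype=float), kind="mergesort")
    return [auroc(labels[idx], probs[idx]) for idx in np.array_split(order, n_groups)]
```

In such a setup, `tta_predict` would be called once per patient, and the resulting probabilities and uncertainty scores passed to `auroc_by_uncertainty` together with the observed pCR labels.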
By integrating clinical data and DCEMR images obtained at three time points before treatment completion, the model achieved an AUROC of 0.833 with a sensitivity of 0.723 and a specificity of 0.800. This performance was significantly superior (p < 0.01) to that of models using only images (AUROC = 0.706) or only clinical data (AUROC = 0.746). After stratifying the patients into eight subgroups based on the uncertainty score, we found that group #1, with the lowest uncertainty, had the highest AUROC, 0.873. The AUROC decreased to 0.637 for group #8, which had the highest uncertainty.
The results indicate that our convolutional LSTM network-based deep learning model can be used to predict pCR early, before the completion of chemotherapy. By combining clinical data and multi-time point deep image representations, our model outperforms models built solely on clinical or image data. Estimating prediction uncertainty may enable physicians to prioritize or disregard predictions based on their associated uncertainties. This approach could potentially enhance the personalization of breast cancer therapy.