Bazgir Omid, Lu James
Modeling & Simulation/Clinical Pharmacology, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA.
iScience. 2023 Aug 17;26(9):107627. doi: 10.1016/j.isci.2023.107627. eCollection 2023 Sep 15.
Robust and accurate survival prediction in clinical trials from high-throughput genomics data is a fundamental challenge in pharmacogenomics. Current machine learning tools often offer limited predictive performance and model interpretability in these settings. In the present study, we extend REFINED-CNN from regression tasks to survival prediction by mapping high-dimensional RNA sequencing data into REFINED images, which are conducive to CNN modeling. We show that the REFINED-CNN survival model can be readily adapted to new tasks of a similar nature (e.g., prediction for new cancer types) using transfer learning with a small number of patients. Furthermore, the model can be interpreted both locally and globally through risk score backpropagation, which quantifies the importance of each feature (e.g., gene) to the survival prediction for the patient or cancer type of interest.
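As a rough illustration of local versus global interpretation through gradients of a risk score, the sketch below replaces the trained REFINED-CNN with a hypothetical one-unit model, so it is not the authors' implementation. The weights, the three-feature inputs, and the function names (`risk_score`, `risk_gradient`, `local_importance`, `global_importance`) are all assumptions made for illustration. Using the absolute gradient as the importance measure is likewise an assumed choice.

```python
import math

# Hypothetical toy risk model standing in for a trained CNN: one tanh unit.
# The weights are assumed values for three illustrative genes.
WEIGHTS = [0.8, -0.5, 0.0]


def risk_score(x):
    """Scalar risk score for one patient's expression vector x."""
    return math.tanh(sum(w * v for w, v in zip(WEIGHTS, x)))


def risk_gradient(x):
    """Gradient of the risk score with respect to each input feature.

    For this model, d/dx_i tanh(w.x) = (1 - tanh(w.x)^2) * w_i, which is
    what backpropagation would return for a single tanh unit.
    """
    s = risk_score(x)
    return [(1.0 - s * s) * w for w in WEIGHTS]


def local_importance(x):
    """Per-patient (local) importance: absolute gradient per feature."""
    return [abs(g) for g in risk_gradient(x)]


def global_importance(patients):
    """Cohort-level (global) importance: mean of the local importances."""
    per_patient = [local_importance(x) for x in patients]
    return [sum(col) / len(patients) for col in zip(*per_patient)]


# Two made-up patients with three features each.
patients = [[1.0, 0.2, 3.0], [0.1, 1.5, -2.0]]
print(local_importance(patients[0]))
print(global_importance(patients))
```

In this toy setting the third feature has zero weight, so its importance is zero for every patient, while the first feature dominates. With a real CNN the gradients would come from an automatic differentiation framework rather than a closed-form expression.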