Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA.
Nat Commun. 2021 Mar 19;12(1):1740. doi: 10.1038/s41467-021-21997-5.
Drug response differs substantially among cancer patients because of inter- and intra-tumor heterogeneity. In particular, the transcriptomic context, especially the tumor microenvironment, has been shown to play a significant role in shaping the actual treatment outcome. In this study, we develop a deep variational autoencoder (VAE) model to compress the expression of thousands of genes into latent vectors in a low-dimensional space. We then demonstrate that these encoded vectors can accurately impute drug response, outperform standard signature-gene-based approaches, and appropriately control overfitting. We apply rigorous quality assessment and validation, including assessment of the impact of cell line lineage, cross-validation, cross-panel evaluation, and application to independent clinical data sets, to ensure the accuracy of the imputed drug response in both cell lines and cancer samples. Specifically, the expression-regulated component (EReX) of the observed drug response achieves high correlation across panels. Using the well-trained models, we impute drug response for The Cancer Genome Atlas data and investigate the features and signatures associated with the imputed drug response, including cell line origins, somatic mutations and tumor mutation burdens, tumor microenvironment, and confounding factors. In summary, our deep learning method and results are useful for the study of signatures and markers of drug response.
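To make the compression step in the abstract concrete, the following is a minimal sketch of a VAE forward pass and its loss in NumPy. It is not the authors' architecture: the layer sizes (2000 genes, 128 hidden units, 16 latent dimensions), the single hidden layer, the squared-error reconstruction term, and all function and variable names are assumptions chosen for illustration, and the input is random data rather than real expression profiles. No training loop is included.

```python
import numpy as np

def init_params(n_genes, n_hidden, n_latent, rng):
    """Small random weights for a one-hidden-layer encoder and decoder (illustrative sizes)."""
    s = 0.01
    return {
        "W_enc": rng.normal(0, s, (n_genes, n_hidden)), "b_enc": np.zeros(n_hidden),
        "W_mu": rng.normal(0, s, (n_hidden, n_latent)), "b_mu": np.zeros(n_latent),
        "W_lv": rng.normal(0, s, (n_hidden, n_latent)), "b_lv": np.zeros(n_latent),
        "W_dec": rng.normal(0, s, (n_latent, n_hidden)), "b_dec": np.zeros(n_hidden),
        "W_out": rng.normal(0, s, (n_hidden, n_genes)), "b_out": np.zeros(n_genes),
    }

def encode(p, x):
    """Map expression profiles to the mean and log-variance of q(z | x)."""
    h = np.maximum(0, x @ p["W_enc"] + p["b_enc"])  # ReLU hidden layer
    return h @ p["W_mu"] + p["b_mu"], h @ p["W_lv"] + p["b_lv"]

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(p, z):
    """Reconstruct expression profiles from latent vectors."""
    h = np.maximum(0, z @ p["W_dec"] + p["b_dec"])
    return h @ p["W_out"] + p["b_out"]

def vae_loss(p, x, rng):
    """Negative ELBO: squared-error reconstruction plus KL(q(z|x) || N(0, I))."""
    mu, logvar = encode(p, x)
    z = reparameterize(mu, logvar, rng)
    x_hat = decode(p, z)
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    kl = np.mean(-0.5 * np.sum(1 + logvar - mu ** 2 - np.exp(logvar), axis=1))
    return recon + kl, recon, kl

# Toy usage: 8 synthetic samples with 2000 "genes" (random numbers, not real data).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 2000))
params = init_params(n_genes=2000, n_hidden=128, n_latent=16, rng=rng)
latent_mean, latent_logvar = encode(params, x)
loss, recon, kl = vae_loss(params, x, rng)
```

After training, the per-sample `latent_mean` vectors would play the role of the encoded vectors described in the abstract, serving as low-dimensional inputs to a separate drug response predictor.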