Hsu Yu-Ching, Chiu Yu-Chiao, Lu Tzu-Pin, Hsiao Tzu-Hung, Chen Yidong
Bioinformatics Program, Taiwan International Graduate Program, National Taiwan University, Taipei 115, Taiwan.
Bioinformatics Program, Institute of Statistical Science, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan.
Patterns (N Y). 2024 Mar 5;5(4):100949. doi: 10.1016/j.patter.2024.100949. eCollection 2024 Apr 12.
Large-scale cancer drug sensitivity data have become available for a collection of cancer cell lines, but only limited drug response data from patients are available. Bridging the gap in pharmacogenomics knowledge between cell line and patient datasets remains challenging. In this study, we trained a deep learning model, Scaden-CA, to deconvolute tumor data into proportions of cancer-type-specific cell lines. We then developed a drug response prediction method that uses the deconvoluted proportions and the drug sensitivity data from cell lines. In validation on Cancer Cell Line Encyclopedia (CCLE) bulk RNA data, the Scaden-CA model performed well in terms of concordance correlation coefficients (>0.9 for model testing) and the correctly deconvoluted rate (>70% across most cancers). We applied the model to tumors in The Cancer Genome Atlas (TCGA) dataset and examined associations between predicted cell viability and mutation status or gene expression levels to understand underlying mechanisms of potential value for drug repurposing.
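The two quantitative ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a tumor's predicted viability for a drug is the proportion-weighted average of the cell-line viabilities, which the abstract does not confirm, and it uses Lin's standard definition of the concordance correlation coefficient. The function names `concordance_ccc` and `predict_viability` and all numbers are hypothetical.

```python
import numpy as np


def concordance_ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()  # population variances
    cov = ((y_true - mean_t) * (y_pred - mean_p)).mean()
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def predict_viability(proportions, cell_line_viability):
    """Proportion-weighted viability per tumor and drug (assumed scheme).

    proportions:         (n_tumors, n_cell_lines), each row sums to 1
    cell_line_viability: (n_cell_lines, n_drugs)
    returns:             (n_tumors, n_drugs)
    """
    proportions = np.asarray(proportions, dtype=float)
    cell_line_viability = np.asarray(cell_line_viability, dtype=float)
    return proportions @ cell_line_viability


# Hypothetical data: 2 tumors deconvoluted into 2 cell lines, 2 drugs.
proportions = np.array([[0.7, 0.3],
                        [0.2, 0.8]])
cell_line_viability = np.array([[0.9, 0.4],
                                [0.5, 0.6]])
predicted = predict_viability(proportions, cell_line_viability)

# CCC penalizes a systematic shift even when the correlation is perfect.
ccc_identical = concordance_ccc([1, 2, 3], [1, 2, 3])
ccc_shifted = concordance_ccc([1, 2, 3], [2, 3, 4])
```

Here `ccc_identical` is 1 and `ccc_shifted` falls below 1 because the prediction is offset by a constant. This is why the concordance correlation coefficient is a stricter check on estimated proportions than the Pearson correlation.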