Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran.
School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.
J Bioinform Comput Biol. 2022 Apr;20(2):2150035. doi: 10.1142/S0219720021500359. Epub 2021 Dec 17.
Predicting tumor drug response from cancer cell line drug response values for a large number of anti-cancer drugs is a significant challenge in personalized medicine. Predicting patient response to drugs from data obtained from preclinical models is made easier by the availability of diverse information on cell lines and drugs. This paper proposes TCLMF, a model for predicting drug response in tumor samples that is trained on preclinical samples and is based on logistic matrix factorization. The TCLMF model is built on gene expression profiles, tissue type information, the chemical structure of drugs, and drug sensitivity (IC50) data from cancer cell lines. We train the proposed drug response model on preclinical data from the Genomics of Drug Sensitivity in Cancer (GDSC) dataset, and then use it to predict the drug sensitivity of samples from The Cancer Genome Atlas (TCGA) dataset. The TCLMF approach focuses on identifying effective features of cell lines and drugs in order to calculate the probability that a tumor sample is sensitive to a drug. In this study, the nearest cell line neighbours of each tumor sample are determined using a similarity measure defined between tumor samples and cell lines. The drug response for a new tumor is then calculated by averaging the low-rank features of its neighboring cell lines. To test the model's performance, we compare the results of TCLMF with those of previously proposed methods using two databases and two approaches. In the first approach, we study 12 drugs with enough known clinical drug response data, which were also considered by previous methods. For 7 of the 12 drugs, TCLMF can significantly distinguish between patients who are resistant to these drugs and patients who are sensitive to them. In the second approach, the methods are converted to classification models using a threshold, and the results are compared.
The results demonstrate that the TCLMF method provides accurate predictions relative to the other algorithms. Finally, we accurately classify tumor tissue type using the latent vectors obtained from TCLMF's logistic matrix factorization process. These findings demonstrate that the TCLMF approach produces effective latent vectors for tumor samples. The source code of the TCLMF method is available at https://github.com/emdadi/TCLMF.
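The neighbour-averaging step described in the abstract can be illustrated with a minimal sketch. It is not taken from the TCLMF repository: the function name `predict_tumor_sensitivity`, the use of Pearson correlation on expression profiles as the similarity measure, the unweighted mean over neighbours, and the bias-free logistic form are all assumptions made for illustration, and the latent matrices are assumed to have been learned beforehand on cell line data.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_tumor_sensitivity(tumor_expr, cell_expr, cell_latent,
                              drug_latent, n_neighbors=3):
    """Hypothetical helper: estimate drug sensitivity probabilities for one tumor.

    tumor_expr  : (n_genes,) expression profile of the tumor sample
    cell_expr   : (n_cells, n_genes) expression profiles of the cell lines
    cell_latent : (n_cells, k) cell line latent vectors (assumed pre-trained)
    drug_latent : (n_drugs, k) drug latent vectors (assumed pre-trained)
    """
    # Assumed similarity: Pearson correlation between the tumor profile
    # and each cell line profile.
    t = tumor_expr - tumor_expr.mean()
    c = cell_expr - cell_expr.mean(axis=1, keepdims=True)
    sims = (c @ t) / (np.linalg.norm(c, axis=1) * np.linalg.norm(t))

    # Indices of the most similar cell lines, most similar first.
    neighbors = np.argsort(sims)[::-1][:n_neighbors]

    # Tumor latent vector = unweighted mean of its neighbours' latent vectors.
    tumor_latent = cell_latent[neighbors].mean(axis=0)

    # Logistic link on the inner product gives one probability per drug.
    return sigmoid(drug_latent @ tumor_latent), neighbors


if __name__ == "__main__":
    # Toy random data, only to show the call shape.
    rng = np.random.default_rng(0)
    cell_expr = rng.normal(size=(5, 20))
    cell_latent = rng.normal(size=(5, 4))
    drug_latent = rng.normal(size=(3, 4))
    tumor_expr = rng.normal(size=20)
    probs, nbrs = predict_tumor_sensitivity(
        tumor_expr, cell_expr, cell_latent, drug_latent, n_neighbors=2
    )
    print("neighbours:", nbrs)
    print("sensitivity probabilities:", probs)
```

In this sketch, thresholding the returned probabilities (for example at 0.5, an arbitrary choice here) would give the kind of sensitive/resistant classification used in the second evaluation approach.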