Department of Pathology, Yanbian University Medical College, Yanji, P.R. China.
Oral Cancer Research Institute, College of Dentistry, Yonsei University, Seoul, Republic of Korea.
Anticancer Res. 2021 May;41(5):2419-2429. doi: 10.21873/anticanres.15017.
BACKGROUND/AIM: Many cancer patients develop multiple primary cancers, and it is challenging to find an anticancer therapy that covers both cancer types in such patients. In personalized medicine, drug response is predicted from genomic information, which makes it possible to choose the most effective therapy for these patients. The aim of this study was to identify chemosensitive gene sets and to compare how accurately the drug response of cancer cell lines can be predicted from the genomic features of the cell lines versus from cancer type.
MATERIALS AND METHODS: We identified a gene set associated with sensitivity to a specific therapeutic drug and used machine learning (ML) to compare the performance of predictive models built on the identified genes with that of models built on cancer type. To this end, publicly available gene expression and drug sensitivity datasets for gastric and pancreatic cancers were used. Five ML algorithms were implemented: linear discriminant analysis, classification and regression tree, k-nearest neighbors, support vector machine and random forest.
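The comparison of the five algorithms can be sketched as follows. This is a minimal illustration only, not the study's code: the abstract does not name the software used, so scikit-learn is an assumption, as are the hypothetical helper `compare_classifiers`, the hyperparameters, the 70/30 split, and the synthetic data standing in for an expression matrix with a binary sensitive/resistant label.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, test_size=0.3, seed=0):
    """Fit five classifiers; return {name: (train_accuracy, test_accuracy)}.

    Hypothetical helper: split ratio and hyperparameters are assumptions,
    not values reported in the abstract.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    models = {
        "LDA": LinearDiscriminantAnalysis(),
        "CART": DecisionTreeClassifier(random_state=seed),
        "kNN": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(kernel="rbf"),
        "RF": RandomForestClassifier(n_estimators=200, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # score() returns mean accuracy for scikit-learn classifiers.
        results[name] = (
            model.score(X_train, y_train),
            model.score(X_test, y_test),
        )
    return results


# Synthetic stand-in for 120 "cell lines" x 20 "genes"; not the study's data.
X, y = make_classification(
    n_samples=120, n_features=20, n_informative=5, random_state=0
)
results = compare_classifiers(X, y)
for name, (train_acc, test_acc) in results.items():
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f}")
```

Reporting training and testing accuracy side by side makes overfitting visible: a model that scores near 1.0 on training data but noticeably lower on held-out data is fitting noise as well as signal.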
RESULTS: The predictive accuracy of the cancer type models was 0.729 to 0.763 on the training dataset and 0.731 to 0.765 on the testing dataset. The predictive accuracy of the genomic prediction models was 0.818 to 1.0 on the training dataset and 0.759 to 0.896 on the testing dataset.
CONCLUSION: Across the ML methods, the specific gene models performed much better than the cancer type models. Therefore, the most effective therapeutic drug can be chosen based on the expression of specific genes in patients with multiple primary cancers, regardless of cancer type.