Liang Xin, Zhu Wen, Liao Bo, Wang Bo, Yang Jialiang, Mo Xiaofei, Li Ruixi
Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.
Key Laboratory of Data Science and Intelligence Education, Ministry of Education, Hainan Normal University, Haikou, China.
Front Bioeng Biotechnol. 2020 Nov 24;8:607126. doi: 10.3389/fbioe.2020.607126. eCollection 2020.
Some carcinomas present with one or more metastatic sites whose primary origin is unknown. Identifying whether a tumor tissue is primary or metastatic is crucial for physicians to develop precise treatment plans. When the primary site is unknown, it is challenging to design a specific plan for the patient. These patients usually receive broad-spectrum chemotherapy, yet their prognosis remains poor. Machine learning has been widely used and has already shown significant advantages in clinical practice. In this study, we classify and predict a large number of tumor samples of uncertain origin by applying the random forest and naive Bayes algorithms. We use precision, recall, and other measures to evaluate the performance of our approach. The results showed that the prediction accuracy of this method was 90.4% for 7,713 samples and 80% for 20 metastatic tumor samples. In addition, 10-fold cross-validation was used to evaluate the classification accuracy, which reached 91%.
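The evaluation measures named in the abstract (accuracy, precision, recall, and 10-fold cross-validation) can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation: the function names `per_class_metrics` and `k_fold_indices` and the toy tissue labels are hypothetical, and the fold splitter uses simple contiguous folds, whereas a real analysis would typically shuffle or stratify the samples first.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and {label: (precision, recall)} for multi-class labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    actual = Counter(y_true)       # how often each class truly occurs
    predicted = Counter(y_pred)    # how often each class was predicted
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    metrics = {}
    for label in sorted(set(actual) | set(predicted)):
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        recall = correct[label] / actual[label] if actual[label] else 0.0
        metrics[label] = (precision, recall)
    accuracy = sum(correct.values()) / len(y_true)
    return accuracy, metrics


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs that split range(n_samples) into k contiguous folds."""
    start = 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Toy labels (hypothetical tissue names, not data from the study).
y_true = ["lung", "lung", "breast", "liver", "breast"]
y_pred = ["lung", "breast", "breast", "liver", "breast"]
accuracy, metrics = per_class_metrics(y_true, y_pred)

# Index folds for a data set the size of the one reported in the abstract.
folds = list(k_fold_indices(7713, k=10))
```

In a cross-validated evaluation, a classifier would be trained on each `train_idx` subset and scored on the matching `test_idx` subset, with the reported accuracy being the average over the ten folds.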