Mo Liying, Su Yuangang, Yuan Jianhui, Xiao Zhiwei, Zhang Ziyan, Lan Xiuwan, Huang Daizheng
School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China.
Research Centre for Regenerative Medicine, Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, Guangxi, China.
Curr Genomics. 2022 Jun 10;23(2):94-108. doi: 10.2174/1389202923666220204153744.
Machine learning methods have shown excellent predictive ability in a wide range of fields. Multi-omics factors have a crucial influence on survival in head and neck squamous cell carcinoma (HNSC). This study aimed to build a variety of machine learning multi-omics models to predict HNSC survival and to identify the most suitable machine learning prediction method. HNSC clinical data and multi-omics data were downloaded from the TCGA database. Important variables were screened with the LASSO algorithm. A total of 12 supervised machine learning models were used to predict HNSC survival outcome, and their results were compared. qPCR was performed to verify the core genes predicted by the random forest algorithm. Across the 12 models, multi-omics data performed better than any single omic alone. The Bayesian network (BN) model (area under the curve [AUC] 0.8250, F1 score 0.7917) and the random forest (RF) model (AUC 0.8002, F1 score 0.7839) showed good predictive performance on the HNSC multi-omics data. The qPCR results were consistent with the RF algorithm. Machine learning methods can improve forecasting of HNSC survival outcome. Of the models compared, the BN and RF models performed best. Moreover, multi-omics predictions were better than those from any single omic alone in HNSC.
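The two metrics reported above, AUC and F1 score, can be illustrated with a minimal pure-Python sketch. The labels, scores, and the 0.5 decision threshold below are made-up toy values chosen for illustration only; they are not the study's data, and the function names are hypothetical rather than taken from the authors' code.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (assumed values): 1 = event, 0 = no event.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]   # hypothetical model outputs
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 cutoff

print(f"AUC = {roc_auc(toy_labels, toy_scores):.4f}")
print(f"F1  = {f1_score(toy_labels, toy_preds):.4f}")
```

AUC depends only on how the model ranks samples, while F1 depends on the chosen classification threshold, which is why a model comparison such as the one above typically reports both.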