Machine learning-based prediction model for distant metastasis of breast cancer.

Affiliations

School of Computer Science and Technology, Hainan University, Haikou, 570228, China.

Beidahuang Industry Group General Hospital, Harbin, 150001, China.

Publication information

Comput Biol Med. 2024 Feb;169:107943. doi: 10.1016/j.compbiomed.2024.107943. Epub 2024 Jan 6.

Abstract

BACKGROUND

Breast cancer is the most prevalent malignancy in women. Advanced breast cancer can develop distant metastases, which pose a severe threat to patients' lives. Because the clinical warning signs of distant metastasis appear only in the late stage of the disease, better methods of predicting metastasis are needed.

METHODS

First, we screened target genes for breast cancer distant metastasis by applying differential expression analysis and weighted gene co-expression network analysis (WGCNA) to the selected datasets, and characterized these target genes with GO enrichment and related analyses. Second, we screened biomarkers of breast cancer distant metastasis by LASSO regression analysis and performed correlation and other analyses on these biomarkers. Finally, we constructed prediction models for breast cancer distant metastasis based on Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT) and eXtreme Gradient Boosting (XGBoost), and selected the best-performing model.
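The last two steps (LASSO-based screening, then comparing classifiers) can be sketched in Python with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the penalty strength `alpha=0.03` is an arbitrary choice, a linear `Lasso` on 0/1 labels stands in for whatever LASSO variant the study used, and XGBoost is left out because it needs a separate package.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix (300 samples x 200 "genes").
# Assumption: real data would come from the study's selected datasets.
X, y = make_classification(n_samples=300, n_features=200, n_informative=15,
                           n_redundant=0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Step 1: L1-penalised (LASSO) screening, fitted on the training split only.
# Features whose coefficient is shrunk to exactly zero are discarded.
scaler = StandardScaler().fit(X_train)
lasso = Lasso(alpha=0.03).fit(scaler.transform(X_train), y_train)
selected = np.flatnonzero(lasso.coef_)

# Step 2: candidate classifiers trained on the selected features.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "GBDT": GradientBoostingClassifier(random_state=0),
}

# Step 3: compare by 5-fold cross-validated AUC and keep the best one.
cv_auc = {
    name: cross_val_score(model, X_train[:, selected], y_train,
                          cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}
best_name = max(cv_auc, key=cv_auc.get)
best_model = models[best_name].fit(X_train[:, selected], y_train)
val_accuracy = best_model.score(X_val[:, selected], y_val)

print(f"{len(selected)} features kept; best model: {best_name}")
print(f"validation accuracy: {val_accuracy:.3f}")
```

In this sketch the screening step runs once before cross-validation for brevity. A stricter evaluation would repeat the selection inside each fold so that held-out samples never influence which features are kept.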

RESULTS

Several 21-gene prediction models for breast cancer distant metastasis were constructed, of which the model based on random forest performed best. On the validation set, this model predicted the emergence of distant metastases from breast cancer with an accuracy of 93.6%, an F1-score of 88.9% and an AUC of 91.3%.
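For readers less familiar with the three reported metrics, the following sketch shows how accuracy, F1-score and AUC are computed from labels, predicted labels and predicted scores. The eight labels and scores are hypothetical toy values chosen for illustration; they are not the study's data, and the 0.5 decision threshold is an assumption.

```python
def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1-score for binary labels (1 = metastasis, 0 = none)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={accuracy:.3f}  F1={f1:.3f}  AUC={auc:.3f}")
```

Accuracy and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank positives against negatives. This is why a model can score higher on accuracy than on AUC, as in the figures reported above.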

CONCLUSION

Our findings have the potential to be translated into a point-of-care prognostic analysis to reduce breast cancer mortality.
