A Survival Status Classification Model for Osteosarcoma Patients Based on E-CNN-SVM and Multisource Data Fusion.

Affiliation

School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China.

Publication information

Comput Intell Neurosci. 2022 Jul 9;2022:9464182. doi: 10.1155/2022/9464182. eCollection 2022.

Abstract

Traditional algorithms for this task have three drawbacks: (1) they focus on only one aspect of the genetic data, or on local feature data, of osteosarcoma patients, so the extracted feature information is not considered as a whole; (2) they do not balance the sample data across categories; and (3) the resulting models generalize poorly, which makes it difficult to classify the survival status of osteosarcoma patients well. To address these drawbacks, this paper designs a survival status prediction model for osteosarcoma patients based on E-CNN-SVM and multisource data fusion, taking full account of the small sample size, high dimensionality, and interclass imbalance of osteosarcoma patients' genetic data. First, the model uses the random forest algorithm to reduce the dimensionality of, and fuse, four gene sequencing datasets that are highly correlated with bone tumors, and then balances the data with a hybrid sampling method that combines the SMOTE algorithm and the TomekLink algorithm. Second, a CNN model with an incentive module further extracts features from the data so that the characteristic information is captured more accurately. Finally, the data are passed to an SVM model to further improve the stability and classification performance of the model. The model has been shown to be more effective at improving classification accuracy for patients with osteosarcoma.
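The hybrid sampling step mentioned in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the function names (`smote`, `tomek_links`, `smote_tomek`), the toy two-dimensional data, the neighbor count `k`, and the choice to drop both members of each Tomek link are all assumptions made for illustration, since the abstract gives no such details.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: euclidean(base, p),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbors
    and carry different labels."""
    def nearest(i):
        return min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: euclidean(X[i], X[j]),
        )
    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def smote_tomek(X, y, minority_label, k=3, seed=0):
    """Oversample the minority class up to the majority size, then remove
    both members of every Tomek link (an assumed cleaning policy)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max(0, (len(X) - len(minority)) - len(minority))
    X_res = X + smote(minority, n_new, k, seed)
    y_res = y + [minority_label] * n_new
    drop = {i for pair in tomek_links(X_res, y_res) for i in pair}
    keep = [i for i in range(len(X_res)) if i not in drop]
    return [X_res[i] for i in keep], [y_res[i] for i in keep]


# Hypothetical toy data: 8 majority samples (label 0), 3 minority (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.0, 0.4], [0.4, 0.0], [0.2, 0.4], [0.4, 0.2],
     [2.0, 2.0], [2.2, 2.1], [2.1, 2.3]]
y = [0] * 8 + [1] * 3

X_bal, y_bal = smote_tomek(X, y, minority_label=1)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

On this toy data the two classes are well separated, so no Tomek links are found and the cleaning step removes nothing; on overlapping classes it would trim borderline pairs after oversampling. In practice a maintained library implementation would normally be preferred over a hand-written one.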


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f8c4/9288314/510aad0cdfcb/CIN2022-9464182.001.jpg
