An Efficient Cancer Classification Model Using Microarray and High-Dimensional Data.

Affiliations

Mathematics and Computer Science Department, Faculty of Science, Menoufia University, Al Minufya, Egypt.

Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia.

Publication information

Comput Intell Neurosci. 2021 Dec 29;2021:7231126. doi: 10.1155/2021/7231126. eCollection 2021.

DOI:10.1155/2021/7231126
PMID:35003246
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8731276/
Abstract

Cancer is one of the leading causes of death worldwide. Expression profiling based on gene microarrays is among the most effective tools for cancer diagnosis, prognosis, and treatment. Each data point (sample) typically contains expression values for tens of thousands of genes, so the data are large-scale, high-dimensional, and highly redundant. Classifying gene expression profiles is considered an NP-hard problem, and feature (gene) selection is one of the most effective ways to handle it. This paper presents a hybrid cancer classification approach that combines several machine learning techniques: Pearson's correlation coefficient as a correlation-based feature selector and reducer; a Decision Tree classifier, which is easy to interpret and needs little parameter tuning; and Grid Search CV (cross-validation) to optimize the maximum-depth hyperparameter. The model is evaluated on seven standard microarray cancer datasets. To assess how informative and relevant the selected features are, several performance measures are used: classification accuracy, specificity, sensitivity, F1-score, and AUC. According to the results, the proposed strategy greatly reduces the number of genes required for classification, selects the most informative features, and increases classification accuracy.
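
The three components named in the abstract (Pearson-based gene filtering, a Decision Tree, and a grid search over the maximum depth) can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation: the synthetic data from `make_classification`, the choice of `k=50` retained genes, the depth grid, the 70/30 split, and the helper names `abs_pearson` and `run_demo` are all assumptions made for the example. The filter is placed inside the pipeline so that gene selection is refit on each cross-validation fold.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.metrics import accuracy_score, f1_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier


def abs_pearson(X, y):
    """Score each feature (gene) by |Pearson r| between its column and the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; give them a score of 0 instead of NaN.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.abs(r)


def run_demo(k=50, seed=0):
    # Synthetic stand-in for a microarray dataset: few samples, many genes (assumption).
    X, y = make_classification(n_samples=120, n_features=2000, n_informative=20,
                               n_redundant=0, random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    pipe = Pipeline([
        ("select", SelectKBest(score_func=abs_pearson, k=k)),  # keep top-k genes
        ("tree", DecisionTreeClassifier(random_state=seed)),
    ])
    # Grid search with 5-fold cross-validation over the tree's maximum depth.
    search = GridSearchCV(pipe, {"tree__max_depth": [2, 3, 4, 5, None]},
                          cv=5, scoring="accuracy")
    search.fit(X_tr, y_tr)

    pred = search.predict(X_te)
    proba = search.predict_proba(X_te)[:, 1]
    selector = search.best_estimator_.named_steps["select"]
    return {
        "n_selected": int(selector.get_support().sum()),
        "best_max_depth": search.best_params_["tree__max_depth"],
        "accuracy": accuracy_score(y_te, pred),
        "sensitivity": recall_score(y_te, pred, pos_label=1),  # true positive rate
        "specificity": recall_score(y_te, pred, pos_label=0),  # true negative rate
        "f1": f1_score(y_te, pred),
        "auc": roc_auc_score(y_te, proba),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(name, value)
```

Scores obtained on synthetic data say nothing about the paper's results; the sketch only shows how the pieces fit together and how the five reported measures can be computed from held-out predictions.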

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/d32e40e8c8e1/CIN2021-7231126.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/ec2ab715cfd2/CIN2021-7231126.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/d5f20b15560d/CIN2021-7231126.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/b97bf1552883/CIN2021-7231126.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/59caa8fc2f46/CIN2021-7231126.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/3be90be5ea84/CIN2021-7231126.006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/9d074467059f/CIN2021-7231126.007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/28179f956fee/CIN2021-7231126.008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/8cd8b05fdd36/CIN2021-7231126.009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/e17cc4104642/CIN2021-7231126.010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/34a410cfc7f7/CIN2021-7231126.011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/e52fb0302366/CIN2021-7231126.012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2a87/8731276/4d3b57766517/CIN2021-7231126.013.jpg

Similar articles

1
An Efficient Cancer Classification Model Using Microarray and High-Dimensional Data.
Comput Intell Neurosci. 2021 Dec 29;2021:7231126. doi: 10.1155/2021/7231126. eCollection 2021.
2
Hybrid Feature Selection Algorithm mRMR-ICA for Cancer Classification from Microarray Gene Expression Data.
Comb Chem High Throughput Screen. 2018;21(6):420-430. doi: 10.2174/1386207321666180601074349.
3
The feature selection bias problem in relation to high-dimensional gene data.
Artif Intell Med. 2016 Jan;66:63-71. doi: 10.1016/j.artmed.2015.11.001. Epub 2015 Nov 14.
4
Improved intelligent water drop-based hybrid feature selection method for microarray data processing.
Comput Biol Chem. 2023 Apr;103:107809. doi: 10.1016/j.compbiolchem.2022.107809. Epub 2023 Jan 13.
5
Top scoring pairs for feature selection in machine learning and applications to cancer outcome prediction.
BMC Bioinformatics. 2011 Sep 23;12:375. doi: 10.1186/1471-2105-12-375.
6
An ensemble machine learning model based on multiple filtering and supervised attribute clustering algorithm for classifying cancer samples.
PeerJ Comput Sci. 2021 Sep 16;7:e671. doi: 10.7717/peerj-cs.671. eCollection 2021.
7
Genetic algorithm-based feature selection with manifold learning for cancer classification using microarray data.
BMC Bioinformatics. 2023 Apr 8;24(1):139. doi: 10.1186/s12859-023-05267-3.
8
A novel and innovative cancer classification framework through a consecutive utilization of hybrid feature selection.
BMC Bioinformatics. 2023 Dec 15;24(1):479. doi: 10.1186/s12859-023-05605-5.
9
Hierarchical gene selection and genetic fuzzy system for cancer microarray data classification.
PLoS One. 2015 Mar 30;10(3):e0120364. doi: 10.1371/journal.pone.0120364. eCollection 2015.
10
C-HMOSHSSA: Gene selection for cancer classification using multi-objective meta-heuristic and machine learning methods.
Comput Methods Programs Biomed. 2019 Sep;178:219-235. doi: 10.1016/j.cmpb.2019.06.029. Epub 2019 Jun 29.

Cited by

1
Deep learning assisted cancer disease prediction from gene expression data using WT-GAN.
BMC Med Inform Decis Mak. 2024 Oct 24;24(1):311. doi: 10.1186/s12911-024-02712-y.
2
Performance enhancement of classifiers through Bio inspired feature selection methods for early detection of lung cancer from microarray genes.
Heliyon. 2024 Aug 17;10(16):e36419. doi: 10.1016/j.heliyon.2024.e36419. eCollection 2024 Aug 30.
3
Enhancement of Classifier Performance with Adam and RanAdam Hyper-Parameter Tuning for Lung Cancer Detection from Microarray Data-In Pursuit of Precision.
Bioengineering (Basel). 2024 Mar 26;11(4):314. doi: 10.3390/bioengineering11040314.
4
A Novel Artificial Electric Field Algorithm for Solving Global Optimization and Real-World Engineering Problems.
Biomimetics (Basel). 2024 Mar 19;9(3):186. doi: 10.3390/biomimetics9030186.
5
LRP5, SLC6A3, and SOX10 Expression in Conventional Ameloblastoma.
Genes (Basel). 2023 Jul 26;14(8):1524. doi: 10.3390/genes14081524.
6
Evaluation and Exploration of Machine Learning and Convolutional Neural Network Classifiers in Detection of Lung Cancer from Microarray Gene-A Paradigm Shift.
Bioengineering (Basel). 2023 Aug 6;10(8):933. doi: 10.3390/bioengineering10080933.
7
Gaussian Blurring Technique for Detecting and Classifying Acute Lymphoblastic Leukemia Cancer Cells from Microscopic Biopsy Images.
Life (Basel). 2023 Jan 28;13(2):348. doi: 10.3390/life13020348.

References

1
Lightning search algorithm: a comprehensive survey.
Appl Intell (Dordr). 2021;51(4):2353-2376. doi: 10.1007/s10489-020-01947-2. Epub 2020 Nov 3.
2
BCD-WERT: a novel approach for breast cancer detection using whale optimization based efficient features and extremely randomized tree algorithm.
PeerJ Comput Sci. 2021 Mar 12;7:e390. doi: 10.7717/peerj-cs.390. eCollection 2021.
3
Cancer Statistics, 2021.
CA Cancer J Clin. 2021 Jan;71(1):7-33. doi: 10.3322/caac.21654. Epub 2021 Jan 12.
4
Detecting biomarkers from microarray data using distributed correlation based gene selection.
Genes Genomics. 2020 Apr;42(4):449-465. doi: 10.1007/s13258-020-00916-w. Epub 2020 Feb 10.
5
Diagnosis and classification of cancer using hybrid model based on ReliefF and convolutional neural network.
Med Hypotheses. 2020 Apr;137:109577. doi: 10.1016/j.mehy.2020.109577. Epub 2020 Jan 20.
6
Efficient feature selection and classification for microarray data.
PLoS One. 2018 Aug 20;13(8):e0202167. doi: 10.1371/journal.pone.0202167. eCollection 2018.
7
Microarray experiments and factors which affect their reliability.
Biol Direct. 2015 Sep 3;10:46. doi: 10.1186/s13062-015-0077-2.
8
Gene selection using iterative feature elimination random forests for survival outcomes.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Sep-Oct;9(5):1422-31. doi: 10.1109/TCBB.2012.63.
9
A survey on filter techniques for feature selection in gene expression microarray analysis.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Jul-Aug;9(4):1106-19. doi: 10.1109/TCBB.2012.33.
10
Microarrays for cancer diagnosis and classification.
Adv Exp Med Biol. 2007;593:74-85. doi: 10.1007/978-0-387-39978-2_8.