Recognition of multiple imbalanced cancer types based on DNA microarray data using ensemble classifiers.

Affiliation

School of Computer Science and Engineering, Jiangsu University of Science and Technology, No. 2 Mengxi Road, Zhenjiang 212003, China.

Publication information

Biomed Res Int. 2013;2013:239628. doi: 10.1155/2013/239628. Epub 2013 Aug 26.

DOI:10.1155/2013/239628
PMID:24078908
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3770038/
Abstract

DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they merely focus on binary-class problem. In this paper, we dealt with multiclass imbalanced classification problem, as encountered in cancer DNA microarray, by using ensemble learning. We utilized one-against-all coding strategy to transform multiclass to multiple binary classes, each of them carrying out feature subspace, which is an evolving version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely, decision threshold adjustment or random undersampling, into each training subset to alleviate the damage of class imbalance. Specifically, support vector machine was used as base classifier, and a novel voting rule called counter voting was presented for making a final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that unlike many traditional classification approaches, our methods are insensitive to class imbalance.
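
The data-preparation steps named in the abstract (one-against-all coding, feature subspaces, random undersampling) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a plain random subspace rather than the paper's evolved "feature subspace" variant, it stops before training the SVM base classifiers, and it omits decision threshold adjustment and the counter voting rule, which the abstract does not specify. All function names, parameters, and the toy data are hypothetical.

```python
import random


def one_vs_all_labels(labels, positive_class):
    """Recode multiclass labels as +1 (positive_class) or -1 (every other class)."""
    return [1 if y == positive_class else -1 for y in labels]


def random_undersample(indices, binary_labels, rng):
    """Keep every minority sample plus an equal-sized random draw from the majority."""
    pos = [i for i in indices if binary_labels[i] == 1]
    neg = [i for i in indices if binary_labels[i] == -1]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return sorted(minority + rng.sample(majority, len(minority)))


def build_training_subsets(X, labels, n_subspaces, subspace_size, seed=0):
    """For each class, build n_subspaces balanced binary subsets on random feature subsets.

    Assumption: a plain random subspace stands in for the paper's feature subspace method.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    subsets = {}
    for cls in sorted(set(labels)):
        y_bin = one_vs_all_labels(labels, cls)
        subsets[cls] = []
        for _ in range(n_subspaces):
            # Draw a random subset of feature (gene) indices for this ensemble member.
            features = sorted(rng.sample(range(n_features), subspace_size))
            # Balance the binary problem by discarding surplus majority samples.
            rows = random_undersample(range(len(X)), y_bin, rng)
            X_sub = [[X[i][j] for j in features] for i in rows]
            y_sub = [y_bin[i] for i in rows]
            subsets[cls].append((features, X_sub, y_sub))
    return subsets


# Toy skewed data (hypothetical): 10 samples, 6 features, classes A/B/C with 6/3/1 samples.
X = [[float(i + j) for j in range(6)] for i in range(10)]
labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
subsets = build_training_subsets(X, labels, n_subspaces=3, subspace_size=4)

for cls, parts in subsets.items():
    print(cls, [len(y_sub) for _, _, y_sub in parts])
# A [8, 8, 8]
# B [6, 6, 6]
# C [2, 2, 2]
```

Each balanced subset would then be used to train one base classifier, with the per-class ensemble outputs combined by a voting rule to produce the final multiclass decision.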

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f659/3770038/830ded26609b/BMRI2013-239628.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f659/3770038/1965ae387918/BMRI2013-239628.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f659/3770038/9c51b609673a/BMRI2013-239628.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f659/3770038/2b08823721bc/BMRI2013-239628.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f659/3770038/8e0368924fb9/BMRI2013-239628.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f659/3770038/5157c7e4df2a/BMRI2013-239628.alg.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f659/3770038/c921d394675b/BMRI2013-239628.alg.002.jpg

Similar articles

1
Recognition of multiple imbalanced cancer types based on DNA microarray data using ensemble classifiers.
Biomed Res Int. 2013;2013:239628. doi: 10.1155/2013/239628. Epub 2013 Aug 26.
2
Multiclass cancer classification by support vector machines with class-wise optimized genes and probability estimates.
J Theor Biol. 2009 Aug 7;259(3):533-40. doi: 10.1016/j.jtbi.2009.04.013. Epub 2009 May 3.
3
Reducing multiclass cancer classification to binary by output coding and SVM.
Comput Biol Chem. 2006 Feb;30(1):63-71. doi: 10.1016/j.compbiolchem.2005.10.008.
4
A class imbalance-aware Relief algorithm for the classification of tumors using microarray gene expression data.
Comput Biol Chem. 2019 Jun;80:121-127. doi: 10.1016/j.compbiolchem.2019.03.017. Epub 2019 Mar 24.
5
Phenotype recognition with combined features and random subspace classifier ensemble.
BMC Bioinformatics. 2011 Apr 30;12:128. doi: 10.1186/1471-2105-12-128.
6
Class-imbalanced classifiers for high-dimensional data.
Brief Bioinform. 2013 Jan;14(1):13-26. doi: 10.1093/bib/bbs006. Epub 2012 Mar 9.
7
An Improved Ensemble Learning Method for Classifying High-Dimensional and Imbalanced Biomedicine Data.
IEEE/ACM Trans Comput Biol Bioinform. 2014 Jul-Aug;11(4):657-66. doi: 10.1109/TCBB.2014.2306838.
8
Stable feature selection and classification algorithms for multiclass microarray data.
Biol Direct. 2012 Oct 2;7:33. doi: 10.1186/1745-6150-7-33.
9
Multiclass classification of microarray data samples with a reduced number of genes.
BMC Bioinformatics. 2011 Feb 22;12:59. doi: 10.1186/1471-2105-12-59.
10
Iterative ensemble feature selection for multiclass classification of imbalanced microarray data.
J Biol Res (Thessalon). 2016 Jul 4;23(Suppl 1):13. doi: 10.1186/s40709-016-0045-8. eCollection 2016 May.

Cited by

1
Algorithm for analyzing randomness in point patterns.
MethodsX. 2025 May 8;14:103360. doi: 10.1016/j.mex.2025.103360. eCollection 2025 Jun.
2
Automatic Assignment of Radiology Examination Protocols Using Pre-trained Language Models with Knowledge Distillation.
AMIA Annu Symp Proc. 2022 Feb 21;2021:668-676. eCollection 2021.
3
COVID Mortality Prediction with Machine Learning Methods: A Systematic Review and Critical Appraisal.
J Pers Med. 2021 Sep 7;11(9):893. doi: 10.3390/jpm11090893.
4
Feature Selection for High-Dimensional and Imbalanced Biomedical Data Based on Robust Correlation Based Redundancy and Binary Grasshopper Optimization Algorithm.
Genes (Basel). 2020 Jun 27;11(7):717. doi: 10.3390/genes11070717.
5
Refinement and validation of the IDIOM score for predicting the risk of gastrointestinal cancer in iron deficiency anaemia.
BMJ Open Gastroenterol. 2020 May;7(1). doi: 10.1136/bmjgast-2020-000403.
6
Particle Swarm Optimized Hybrid Kernel-Based Multiclass Support Vector Machine for Microarray Cancer Data Analysis.
Biomed Res Int. 2019 Dec 14;2019:4085725. doi: 10.1155/2019/4085725. eCollection 2019.
7
Machine Learning and Integrative Analysis of Biomedical Big Data.
Genes (Basel). 2019 Jan 28;10(2):87. doi: 10.3390/genes10020087.
8
De novo pathway-based biomarker identification.
Nucleic Acids Res. 2017 Sep 19;45(16):e151. doi: 10.1093/nar/gkx642.
9
A robust and accurate method for feature selection and prioritization from multi-class OMICs data.
PLoS One. 2014 Sep 23;9(9):e107801. doi: 10.1371/journal.pone.0107801. eCollection 2014.

References

1
A novel weighted support vector machine based on particle swarm optimization for gene selection and tumor classification.
Comput Math Methods Med. 2012;2012:320698. doi: 10.1155/2012/320698. Epub 2012 Jul 26.
2
Discriminative local subspaces in gene expression data for effective gene function prediction.
Bioinformatics. 2012 Sep 1;28(17):2256-64. doi: 10.1093/bioinformatics/bts455. Epub 2012 Jul 20.
3
Multiclass microarray data classification based on confidence evaluation.
Genet Mol Res. 2012 May 15;11(2):1357-69. doi: 10.4238/2012.May.15.6.
4
Multiclass Imbalance Problems: Analysis and Potential Solutions.
IEEE Trans Syst Man Cybern B Cybern. 2012 Aug;42(4):1119-30. doi: 10.1109/TSMCB.2012.2187280. Epub 2012 Mar 16.
5
Class-imbalanced classifiers for high-dimensional data.
Brief Bioinform. 2013 Jan;14(1):13-26. doi: 10.1093/bib/bbs006. Epub 2012 Mar 9.
6
Combining multiple approaches for gene microarray classification.
Bioinformatics. 2012 Apr 15;28(8):1151-7. doi: 10.1093/bioinformatics/bts108. Epub 2012 Mar 5.
7
Interplay between gene expression noise and regulatory network architecture.
Trends Genet. 2012 May;28(5):221-32. doi: 10.1016/j.tig.2012.01.006. Epub 2012 Feb 25.
8
Microarray-based cancer prediction using single genes.
BMC Bioinformatics. 2011 Oct 7;12:391. doi: 10.1186/1471-2105-12-391.
9
The role of gene expression profiling in drug discovery.
Curr Opin Pharmacol. 2011 Oct;11(5):549-56. doi: 10.1016/j.coph.2011.06.009.
10
Class prediction for high-dimensional class-imbalanced data.
BMC Bioinformatics. 2010 Oct 20;11:523. doi: 10.1186/1471-2105-11-523.