Ensemble learning with active example selection for imbalanced biomedical data classification.

Affiliation

WISE Lab., Division of Information and Computer Engineering, Ajou University, Suwon, Kyeonggi 443-749, Korea.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2011 Mar-Apr;8(2):316-25. doi: 10.1109/TCBB.2010.96.

DOI:10.1109/TCBB.2010.96

PMID:20876935

Abstract

In biomedical data, class imbalance occurs frequently and causes poor prediction performance for minority classes, because the trained classifiers are derived mostly from the majority class. In this paper, we describe an ensemble learning method combined with active example selection to resolve the imbalanced data problem. Our method consists of three key components: 1) an active example selection algorithm that chooses informative examples for training the classifier, 2) an ensemble learning method that combines the variations of classifiers derived by active example selection, and 3) an incremental learning scheme that speeds up the iterative training procedure for active example selection. We evaluate the method on six real-world imbalanced data sets in biomedical domains and show that it outperforms both random undersampling and ensemble-with-undersampling methods. Compared to other approaches to the imbalanced data problem, our method achieves an AUC that is 0.03-0.15 higher.
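The three components named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the base classifier or the selection criterion, so everything concrete here is an assumption made for illustration: a nearest-centroid scorer stands in for the classifier, "informative" is taken to mean "closest to the current decision boundary", the data are synthetic 2-D Gaussians, and all names (`CentroidScorer`, `train_active`, `train_ensemble`, `demo`) are hypothetical rather than taken from the paper.

```python
import random


class CentroidScorer:
    """Nearest-centroid scorer; class sums are updated one example at a time."""

    def __init__(self, dim):
        self.sums = {0: [0.0] * dim, 1: [0.0] * dim}
        self.counts = {0: 0, 1: 0}

    def partial_fit(self, x, y):
        # Incremental step: fold one example into its class sum, no retraining.
        self.counts[y] += 1
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]

    def score(self, x):
        # Squared distance to each class mean; > 0 means closer to minority (1).
        dist = {}
        for c in (0, 1):
            mean = [s / self.counts[c] for s in self.sums[c]]
            dist[c] = sum((a - b) ** 2 for a, b in zip(x, mean))
        return dist[0] - dist[1]


def train_active(minority, majority, rng, n_init=10, rounds=5, batch=5):
    """Train one member: all minority examples plus actively chosen majority ones."""
    clf = CentroidScorer(len(minority[0]))
    for x in minority:
        clf.partial_fit(x, 1)
    pool = list(majority)
    rng.shuffle(pool)
    for x in pool[:n_init]:  # random initial set drawn from the majority class
        clf.partial_fit(x, 0)
    pool = pool[n_init:]
    for _ in range(rounds):
        # Assumed criterion: the most informative examples are the ones
        # closest to the current decision boundary (smallest |score|).
        pool.sort(key=lambda x: abs(clf.score(x)))
        chosen, pool = pool[:batch], pool[batch:]
        for x in chosen:
            clf.partial_fit(x, 0)
    return clf


def train_ensemble(minority, majority, n_members=5, seed=0):
    # Members differ through their random initial sets, hence their selections.
    return [train_active(minority, majority, random.Random(seed + i))
            for i in range(n_members)]


def ensemble_score(members, x):
    # Combine members by averaging their scores.
    return sum(m.score(x) for m in members) / len(members)


def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def demo(seed=0):
    rng = random.Random(seed)

    def sample(center, n):
        return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]

    # Imbalanced training set: 20 minority vs. 400 majority examples.
    members = train_ensemble(sample(2.0, 20), sample(0.0, 400), seed=seed)
    pos = [ensemble_score(members, x) for x in sample(2.0, 50)]
    neg = [ensemble_score(members, x) for x in sample(0.0, 500)]
    return auc(pos, neg)


if __name__ == "__main__":
    print(f"ensemble AUC on synthetic data: {demo():.3f}")
```

The centroid scorer was chosen only because its `partial_fit` makes the incremental idea easy to see: adding a selected example costs one vector addition instead of a full retraining pass. Any classifier that supports incremental updates could take its place, and the numbers this sketch produces say nothing about the results reported in the paper.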

Similar articles

Ensemble learning with active example selection for imbalanced biomedical data classification.

IEEE/ACM Trans Comput Biol Bioinform. 2011 Mar-Apr;8(2):316-25. doi: 10.1109/TCBB.2010.96.

Evolutionary undersampling for classification with imbalanced datasets: proposals and taxonomy.

Evol Comput. 2009 Fall;17(3):275-306. doi: 10.1162/evco.2009.17.3.275.

Learning to improve medical decision making from imbalanced data without a priori cost.

BMC Med Inform Decis Mak. 2014 Dec 5;14:111. doi: 10.1186/s12911-014-0111-9.

Embedding Undersampling Rotation Forest for Imbalanced Problem.

Comput Intell Neurosci. 2018 Nov 1;2018:6798042. doi: 10.1155/2018/6798042. eCollection 2018.

Protein classification with imbalanced data.

Proteins. 2008 Mar;70(4):1125-32. doi: 10.1002/prot.21870.

Class-imbalanced classifiers for high-dimensional data.

Brief Bioinform. 2013 Jan;14(1):13-26. doi: 10.1093/bib/bbs006. Epub 2012 Mar 9.

Immune centroids oversampling method for binary classification.

Comput Intell Neurosci. 2015;2015:109806. doi: 10.1155/2015/109806. Epub 2015 Mar 5.

A novel ensemble machine learning for robust microarray data classification.

Comput Biol Med. 2006 Jun;36(6):553-73. doi: 10.1016/j.compbiomed.2005.04.001. Epub 2005 Jun 23.

Ensemble based adaptive over-sampling method for imbalanced data learning in computer aided detection of microaneurysm.

Comput Med Imaging Graph. 2017 Jan;55:54-67. doi: 10.1016/j.compmedimag.2016.07.011. Epub 2016 Aug 1.

An active learning based classification strategy for the minority class problem: application to histopathology annotation.

BMC Bioinformatics. 2011 Oct 28;12:424. doi: 10.1186/1471-2105-12-424.

Cited by

An ensemble learning with active sampling to predict the prognosis of postoperative non-small cell lung cancer patients.

BMC Med Inform Decis Mak. 2022 Sep 19;22(1):245. doi: 10.1186/s12911-022-01960-0.

Risk factor analysis of device-related infections: value of re-sampling method on the real-world imbalanced dataset.

BMC Med Inform Decis Mak. 2019 Sep 11;19(1):185. doi: 10.1186/s12911-019-0899-4.

Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN.

Biomed Eng Online. 2018 Dec 4;17(1):181. doi: 10.1186/s12938-018-0604-3.

Application of Convolutional Neural Network in the Diagnosis of Jaw Tumors.

Healthc Inform Res. 2018 Jul;24(3):236-241. doi: 10.4258/hir.2018.24.3.236. Epub 2018 Jul 31.

Imbalanced target prediction with pattern discovery on clinical data repositories.

BMC Med Inform Decis Mak. 2017 Apr 20;17(1):47. doi: 10.1186/s12911-017-0443-3.

Diagnostic biases in translational bioinformatics.

BMC Med Genomics. 2015 Aug 1;8:46. doi: 10.1186/s12920-015-0116-y.

Overcome support vector machine diagnosis overfitting.

Cancer Inform. 2014 Dec 9;13(Suppl 1):145-58. doi: 10.4137/CIN.S13875. eCollection 2014.

Learning from data: recognizing glaucomatous defect patterns and detecting progression from visual field measurements.

IEEE Trans Biomed Eng. 2014 Jul;61(7):2112-24. doi: 10.1109/TBME.2014.2314714. Epub 2014 Apr 1.

A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers.

BMC Bioinformatics. 2012 Dec 8;13:326. doi: 10.1186/1471-2105-13-326.
