Ensemble learning with active example selection for imbalanced biomedical data classification.

Affiliation

WISE Lab., Division of Information and Computer Engineering, Ajou University, Suwon, Kyeonggi 443-749, Korea.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2011 Mar-Apr;8(2):316-25. doi: 10.1109/TCBB.2010.96.

Abstract

In biomedical data, the imbalanced data problem occurs frequently and causes poor prediction performance for minority classes. This is because the trained classifiers are derived mostly from the majority class. In this paper, we describe an ensemble learning method combined with active example selection to resolve the imbalanced data problem. Our method consists of three key components: 1) an active example selection algorithm that chooses informative examples for training the classifier, 2) an ensemble learning method that combines the classifier variants produced by active example selection, and 3) an incremental learning scheme that speeds up the iterative training procedure for active example selection. We evaluate the method on six real-world imbalanced data sets from biomedical domains and show that it outperforms both random undersampling and an ensemble with undersampling. Compared with other approaches to the imbalanced data problem, our method achieves AUC scores that are higher by 0.03-0.15 points.
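The abstract does not give the algorithm itself, so the following is only a rough sketch of how the three components could fit together, not the authors' implementation. It assumes a small logistic regression as the base classifier, uncertainty (score closest to 0.5) as the notion of an "informative" example, warm-started weights as a stand-in for incremental learning, and score averaging as the ensemble rule. All names (`LogisticModel`, `train_with_active_selection`, `ensemble_score`, `make_data`) and the synthetic data are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class LogisticModel:
    """Tiny logistic regression trained by SGD; weights persist across fits."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def score(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def fit(self, X, y, epochs, rng):
        # Warm start: each call continues from the weights of the previous
        # round instead of retraining from scratch (assumed incremental step).
        order = list(range(len(X)))
        for _ in range(epochs):
            rng.shuffle(order)
            for i in order:
                err = self.score(X[i]) - y[i]
                self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, X[i])]
                self.b -= self.lr * err


def train_with_active_selection(X, y, rng, rounds=5, batch=10):
    pos = [i for i, label in enumerate(y) if label == 1]  # minority class
    neg = [i for i, label in enumerate(y) if label == 0]  # majority class
    rng.shuffle(neg)
    # Start from a balanced subset; the rest of the majority class is a pool.
    chosen, pool = neg[:len(pos)], neg[len(pos):]
    model = LogisticModel(len(X[0]))
    for r in range(rounds):
        idx = pos + chosen
        model.fit([X[i] for i in idx], [y[i] for i in idx], epochs=20, rng=rng)
        if r == rounds - 1 or not pool:
            break
        # Assumed selection rule: add the pool examples the current model is
        # least certain about (score closest to 0.5).
        pool.sort(key=lambda i: abs(model.score(X[i]) - 0.5))
        chosen, pool = chosen + pool[:batch], pool[batch:]
    return model


def ensemble_score(models, x):
    # Combine the classifier variants by averaging their scores.
    return sum(m.score(x) for m in models) / len(models)


def auc(scores, labels):
    # Probability that a random positive outranks a random negative.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_data(rng, n_pos=20, n_neg=200):
    # Synthetic 1:10 imbalanced data, purely for illustration.
    X = [[rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)] for _ in range(n_pos)]
    X += [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n_neg)]
    return X, [1] * n_pos + [0] * n_neg


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(rng)
    models = [train_with_active_selection(X, y, rng) for _ in range(5)]
    print("AUC:", auc([ensemble_score(models, x) for x in X], y))
```

Each ensemble member starts from a different random balanced subset, so the members differ in which majority examples they end up seeing; that diversity is what the averaging step relies on. The AUC printed here is computed on the training data of a toy problem and says nothing about the results reported in the paper.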
