Combined rule extraction and feature elimination in supervised classification.

Author information

Department of Computer and Information Science, University of Mississippi, University, MS 38677, USA.

Publication information

IEEE Trans Nanobioscience. 2012 Sep;11(3):228-36. doi: 10.1109/TNB.2012.2213264.

Abstract

Many biology-related research problems combine multiple sources of data to achieve a better understanding of the underlying problem, so it is important to select and interpret the most important information from these sources. An algorithm that simultaneously extracts rules and selects features would therefore make the predictive model easier to interpret. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. Using a small number of decision rules, CRF achieves performance comparable with state-of-the-art prediction algorithms. Some of the decision rules are biologically significant.
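The abstract does not give the CRF formulation, so the following is only a minimal sketch of the general idea it describes: represent each sample by which rules fire on it, fit a 1-norm (L1) penalized linear model over those rule indicators, and keep the rules with nonzero weight together with the features they mention. Everything here is an assumption made for illustration: the rule pool is generated at random as a stand-in for root-to-leaf paths harvested from a random forest, the loss is penalized least squares solved by proximal gradient descent (ISTA) rather than whatever optimization the paper uses, and the names `fires`, `rule_matrix`, `l1_fit`, and `selected` are hypothetical.

```python
import random

def fires(rule, x):
    # A rule is a conjunction of (feature, threshold, is_leq) conditions,
    # mimicking one root-to-leaf path of a decision tree (assumed encoding).
    return all((x[f] <= t) == is_leq for f, t, is_leq in rule)

def rule_matrix(X, rules):
    # Binary design matrix: entry [i][j] is 1.0 when rule j fires on sample i.
    return [[1.0 if fires(r, x) else 0.0 for r in rules] for x in X]

def soft_threshold(v, k):
    # Proximal operator of k * |v|; it sets small weights to exactly zero.
    if v > k:
        return v - k
    if v < -k:
        return v + k
    return 0.0

def l1_fit(R, y, lam, steps=200):
    # ISTA for (1/2n) * ||R w + b - y||^2 + lam * ||w||_1 (intercept b unpenalized).
    n, m = len(R), len(R[0])
    lr = 1.0 / (m + 1)  # safe step: entries are 0/1, so the Lipschitz constant <= m + 1
    w, b = [0.0] * m, 0.0
    for _ in range(steps):
        resid = [b + sum(rij * wj for rij, wj in zip(row, w)) - yi
                 for row, yi in zip(R, y)]
        grad_w = [sum(R[i][j] * resid[i] for i in range(n)) / n for j in range(m)]
        grad_b = sum(resid) / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def selected(rules, w):
    # Rules with nonzero weight survive; selected features are those they mention.
    kept = [r for r, wj in zip(rules, w) if wj != 0.0]
    feats = sorted({f for r in kept for f, _, _ in r})
    return kept, feats

# Synthetic data (not from the paper): only feature 0 determines the label.
rng = random.Random(0)
n_features = 5
X = [[rng.random() for _ in range(n_features)] for _ in range(100)]
y = [1.0 if x[0] <= 0.5 else -1.0 for x in X]

def random_rule():
    # Stand-in for a forest-derived rule: one or two random threshold conditions.
    return tuple((rng.randrange(n_features), round(rng.random(), 2), rng.random() < 0.5)
                 for _ in range(rng.choice([1, 2])))

rules = [random_rule() for _ in range(20)]
R = rule_matrix(X, rules)
w, b = l1_fit(R, y, lam=0.05)
kept_rules, kept_features = selected(rules, w)
print(len(kept_rules), "of", len(rules), "rules kept; features used:", kept_features)
```

Raising `lam` prunes more rules and, through them, more features; a sufficiently large value removes every rule. In a fuller treatment the rule pool would come from the trees of a trained random forest rather than from random thresholds.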
