Gene selection using iterative feature elimination random forests for survival outcomes.

Affiliation

Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27705, USA.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2012 Sep-Oct;9(5):1422-31. doi: 10.1109/TCBB.2012.63.

Abstract

Although many feature selection methods for classification have been developed, there remains a need for methods that identify genes in high-dimensional data with censored survival outcomes. Traditional methods for gene selection in classification problems have several drawbacks. First, the majority of gene selection approaches for classification are single-gene based. Second, many gene selection procedures are not embedded within the algorithm itself. Random forests have been found to perform well in high-dimensional data settings with survival outcomes, and they have an embedded feature for identifying variables of importance. They are therefore a natural candidate for gene selection in high-dimensional data with survival outcomes. In this paper, we develop a novel method based on random forests to identify a set of prognostic genes. We compare our method with several machine learning methods and various node split criteria using several real data sets. Our method performed well in both simulations and real data analysis. Additionally, we have shown the advantages of our approach over single-gene-based approaches. Our method incorporates multivariate correlations in microarray data for survival outcomes. The described method allows us to better utilize the information available from microarray data with survival outcomes.
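The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a generic iterative feature elimination loop could look like, not the authors' procedure. Everything named here is an assumption made for illustration: the function names, the drop fraction, the use of Harrell's concordance index as the error measure, and in particular `toy_fit_and_rank`, a small stand-in scorer that takes the place of fitting a random survival forest and reading off its prediction error and variable importance.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell's C: share of comparable pairs in which the subject who
    fails earlier has the higher risk score (ties in risk count 0.5)."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else 0.5


def iterative_feature_elimination(
    features: Sequence[str],
    fit_and_rank: Callable[[List[str]], Tuple[float, Dict[str, float]]],
    drop_fraction: float = 0.2,   # assumed value, not taken from the paper
    min_features: int = 1,
) -> Tuple[List[str], float]:
    """Repeatedly fit, rank features by importance, and drop the weakest.
    Returns the subset with the lowest error seen (smaller subset on ties)."""
    current = list(features)
    best_subset, best_error = list(current), float("inf")
    while True:
        error, importance = fit_and_rank(current)
        if error <= best_error:
            best_subset, best_error = list(current), error
        if len(current) <= min_features:
            break
        n_drop = max(1, int(len(current) * drop_fraction))
        n_keep = max(min_features, len(current) - n_drop)
        ranked = sorted(current, key=lambda f: importance[f], reverse=True)
        current = ranked[:n_keep]
    return best_subset, best_error


# Hypothetical toy data: six subjects, three made-up genes.
times = [2, 4, 6, 8, 10, 12]
events = [1, 1, 0, 1, 1, 0]          # 1 = event observed, 0 = censored
expr = {
    "g_signal": [6, 5, 4, 3, 2, 1],  # high expression goes with early failure
    "g_noise1": [1, 3, 2, 3, 1, 2],
    "g_noise2": [2, 1, 3, 1, 3, 2],
}


def toy_fit_and_rank(subset: List[str]) -> Tuple[float, Dict[str, float]]:
    """Stand-in for a survival forest (assumption): the risk score is the sum
    of expression over the subset, the error is 1 - C, and a gene's importance
    is the increase in error when that gene is left out of the subset."""
    def error_of(genes: List[str]) -> float:
        risks = [sum(expr[g][k] for g in genes) for k in range(len(times))]
        return 1.0 - concordance_index(times, events, risks)

    error = error_of(subset)
    importance = {g: error_of([h for h in subset if h != g]) - error
                  for g in subset}
    return error, importance


best_genes, best_error = iterative_feature_elimination(list(expr), toy_fit_and_rank)
print(best_genes, best_error)
```

Because `fit_and_rank` is passed in as a callable, the elimination loop stays independent of the model; in real use it would presumably wrap a random survival forest fit, returning an out-of-bag error and the forest's variable importance for the current gene subset.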
