Exploring correlations in gene expression microarray data for maximum predictive-minimum redundancy biomarker selection and classification.

Author information

Department of Statistics, Operational Research and Numerical Analysis, Universidad Nacional de Educación a Distancia (UNED), Paseo Senda del Rey 9, 28040 Madrid, Spain.

Publication information

Comput Biol Med. 2013 Oct;43(10):1437-43. doi: 10.1016/j.compbiomed.2013.07.005. Epub 2013 Jul 13.

DOI:10.1016/j.compbiomed.2013.07.005

PMID:24034735

Abstract

An important issue in the analysis of gene expression microarray data is concerned with the extraction of valuable genetic interactions from high dimensional data sets containing gene expression levels collected for a small sample of assays. Past and ongoing research efforts have been focused on biomarker selection for phenotype classification. Usually, many genes convey useless information for classifying the outcome and should be removed from the analysis; on the other hand, some of them may be highly correlated, which reveals the presence of redundant expressed information. In this paper we propose a method for the selection of highly predictive genes having a low redundancy in their expression levels. The predictive accuracy of the selection is assessed by means of Classification and Regression Trees (CART) models which enable assessment of the performance of the selected genes for classifying the outcome variable and will also uncover complex genetic interactions. The method is illustrated throughout the paper using a public domain colon cancer gene expression data set.
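The abstract does not spell out the selection rule, so the following is only a generic sketch of the idea and not the authors' algorithm. It assumes a greedy filter: genes are ranked by the absolute Pearson correlation between expression and a 0/1 class label, and a gene is skipped when its absolute correlation with an already selected gene exceeds a threshold. The function names (`select_genes`, `cv_accuracy`), the threshold `max_abs_corr`, the tree depth, and the synthetic data are all illustrative assumptions. A CART-style tree from scikit-learn then scores the selected genes, with selection repeated inside each cross-validation fold so the accuracy estimate is not inflated by selecting on the test samples.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier


def select_genes(X, y, k, max_abs_corr=0.9):
    """Greedy relevance/redundancy filter (illustrative, not the paper's rule).

    X: (samples, genes) expression matrix; y: 0/1 class labels.
    Assumes no gene has constant expression (otherwise the correlation is undefined).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Relevance: |Pearson correlation| between each gene and the class label.
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    relevance = np.abs(Xc.T @ yc / denom)
    selected = []
    for j in np.argsort(-relevance):  # most relevant gene first
        if len(selected) == k:
            break
        # Redundancy: skip a gene that is highly correlated with one already kept.
        if all(abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) <= max_abs_corr
               for s in selected):
            selected.append(int(j))
    return selected


def cv_accuracy(X, y, k, n_splits=5):
    """Cross-validated accuracy of a small classification tree on the selected genes."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    accs = []
    for train, test in folds.split(X, y):
        genes = select_genes(X[train], y[train], k)  # select on training fold only
        tree = DecisionTreeClassifier(max_depth=3, random_state=0)
        tree.fit(X[train][:, genes], y[train])
        accs.append(tree.score(X[test][:, genes], y[test]))
    return float(np.mean(accs))


# Synthetic stand-in for a microarray data set: few samples, many genes.
rng = np.random.default_rng(0)
n_samples, n_genes = 60, 200
y = np.array([0] * 30 + [1] * 30)
X = rng.normal(size=(n_samples, n_genes))
X[:, 0] += 2.0 * y                                      # informative gene
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n_samples)   # near-copy of gene 0 (redundant)
X[:, 2] += 2.0 * y                                      # second informative gene

selected = select_genes(X, y, k=5)
acc = cv_accuracy(X, y, k=5)
print("selected genes:", selected)
print("cross-validated accuracy:", round(acc, 3))
```

In this toy setup, genes 0 and 1 carry nearly the same signal, so the filter should keep only one of them while still retaining gene 2. On real expression data, the threshold and the number of genes `k` would need tuning.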

Similar articles

Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery.

BMC Bioinformatics. 2005 Apr 13;6:97. doi: 10.1186/1471-2105-6-97.

Ensemble gene selection by grouping for microarray data classification.

J Biomed Inform. 2010 Feb;43(1):81-7. doi: 10.1016/j.jbi.2009.08.010. Epub 2009 Aug 20.

Selecting a minimal number of relevant genes from microarray data to design accurate tissue classifiers.

Biosystems. 2007 Jul-Aug;90(1):78-86. doi: 10.1016/j.biosystems.2006.07.002. Epub 2006 Jul 10.

Tumor classification ranking from microarray data.

BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S21. doi: 10.1186/1471-2164-9-S2-S21.

Cancer classification and prediction using logistic regression with Bayesian gene selection.

J Biomed Inform. 2004 Aug;37(4):249-59. doi: 10.1016/j.jbi.2004.07.009.

A gene selection method for classifying cancer samples using 1D discrete wavelet transform.

Int J Comput Biol Drug Des. 2009;2(4):398-411. doi: 10.1504/IJCBDD.2009.030769. Epub 2009 Jan 4.

Pathway analysis using random forests classification and regression.

Bioinformatics. 2006 Aug 15;22(16):2028-36. doi: 10.1093/bioinformatics/btl344. Epub 2006 Jun 29.

The feature selection bias problem in relation to high-dimensional gene data.

Artif Intell Med. 2016 Jan;66:63-71. doi: 10.1016/j.artmed.2015.11.001. Epub 2015 Nov 14.

Chaotic genetic algorithm for gene selection and classification problems.

OMICS. 2009 Oct;13(5):407-20. doi: 10.1089/omi.2009.0007.

Cited by

Utilizing multimodal radiomics technology from cervical MRI for diagnosis of cervical spinal cord injury and spinal cord concussion.

Sci Rep. 2024 Aug 12;14(1):18686. doi: 10.1038/s41598-024-69784-8.

Machine Learning Based Computational Gene Selection Models: A Survey, Performance Evaluation, Open Issues, and Future Research Directions.

Front Genet. 2020 Dec 10;11:603808. doi: 10.3389/fgene.2020.603808. eCollection 2020.

Combining statistical techniques to predict postsurgical risk of 1-year mortality for patients with colon cancer.

Clin Epidemiol. 2018 Mar 6;10:235-251. doi: 10.2147/CLEP.S146729. eCollection 2018.
