Improving reliability of gene selection from microarray functional genomics data.

Author information

Fu Li M, Youn Eun Seog

Affiliation

University of Florida, Gainesville, FL 32611, USA.

Publication information

IEEE Trans Inf Technol Biomed. 2003 Sep;7(3):191-6. doi: 10.1109/titb.2003.816558.

Abstract

Constructing a classifier based on microarray gene expression data has recently emerged as an important problem for cancer classification. Recent results have suggested the feasibility of constructing such a classifier with reasonable predictive accuracy under the circumstance where only a small number of cancer tissue samples of known type are available. Difficulty arises from the fact that each sample contains the expression data of a vast number of genes and these genes may interact with one another. Selection of a small number of critical genes is fundamental to correctly analyze the otherwise overwhelming data. It is essential to use a multivariate approach for capturing the correlated structure in the data. However, the curse of dimensionality leads to the concern about the reliability of selected genes. Here, we present a new gene selection method in which error and repeatability of selected genes are assessed within the context of M-fold cross-validation. In particular, we show that the method is able to identify source variables underlying data generation.
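The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of assessing both error and repeatability of selected genes inside M-fold cross-validation. Everything in it is an assumption made for illustration: the function names, the univariate signal-to-noise score (a stand-in, not the multivariate approach the authors call for), the nearest-centroid classifier, and the synthetic data in which the first three genes act as the "source variables".

```python
import numpy as np

def select_genes(X, y, k):
    """Return indices of the k top-ranked genes (hypothetical univariate score)."""
    a, b = X[y == 0], X[y == 1]
    # Signal-to-noise ratio per gene; the small constant avoids division by zero.
    score = np.abs(a.mean(axis=0) - b.mean(axis=0)) / (a.std(axis=0) + b.std(axis=0) + 1e-12)
    return np.argsort(score)[::-1][:k]

def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test sample to the class with the closer training centroid."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - c0, axis=1)
    d1 = np.linalg.norm(X_test - c1, axis=1)
    return (d1 < d0).astype(int)

def cv_gene_selection(X, y, k=5, m=5, seed=0):
    """Run M-fold CV; return (error rate, per-gene selection frequency)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), m)
    counts = np.zeros(X.shape[1], dtype=int)
    errors = 0
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        X_tr, y_tr = X[train_mask], y[train_mask]
        # Genes are selected on the training folds only, so the held-out
        # error is not biased by the selection step.
        genes = select_genes(X_tr, y_tr, k)
        counts[genes] += 1
        pred = nearest_centroid_predict(X_tr[:, genes], y_tr, X[test_idx][:, genes])
        errors += int(np.sum(pred != y[test_idx]))
    # Repeatability here means the fraction of folds in which a gene was selected.
    return errors / len(y), counts / m

# Synthetic data (assumption): 40 samples, 200 genes, genes 0-2 carry the class signal.
rng = np.random.default_rng(42)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 200))
X[y == 1, :3] += 3.0

error_rate, repeatability = cv_gene_selection(X, y, k=5, m=5)
print("CV error rate:", error_rate)
print("Genes selected in every fold:", np.flatnonzero(repeatability == 1.0))
```

In this sketch a gene that is selected in every fold is treated as reliable, while one that appears in only a single fold is more likely an artifact of the small sample size. Performing the selection inside each fold, rather than once on the full data, is what keeps the cross-validated error an honest estimate.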
