Tumor classification and marker gene prediction by feature selection and fuzzy c-means clustering using microarray data.

Authors

Wang Junbai, Bø Trond Hellem, Jonassen Inge, Myklebost Ola, Hovig Eivind

Affiliation

Department of Tumor Biology, The Norwegian Radium Hospital, N0310 Oslo, Norway.

Publication information

BMC Bioinformatics. 2003 Dec 2;4:60. doi: 10.1186/1471-2105-4-60.

DOI:10.1186/1471-2105-4-60

PMID:14651757

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC302113/

Abstract

BACKGROUND

Using DNA microarrays, we have developed two novel models for tumor classification and target gene prediction. First, gene expression profiles are summarized by optimally selected Self-Organizing Maps (SOMs), followed by tumor sample classification by Fuzzy C-means clustering. Then, the prediction of marker genes is accomplished by either manual feature selection (visualizing the weighted/mean SOM component plane) or automatic feature selection (by pair-wise Fisher's linear discriminant).
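The two building blocks named above, fuzzy c-means clustering of samples and a Fisher-type criterion for ranking genes, can be illustrated with a minimal sketch. It is not the authors' implementation: the SOM summarization step is omitted, the data are synthetic, the fuzzifier `m=2.0`, the function names, and the tolerance are assumptions, and the score shown is the textbook two-class Fisher criterion rather than the paper's exact pair-wise formulation.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Cluster the rows of X; return (centers, memberships).

    memberships[i, k] is the degree to which sample i belongs to
    cluster k; each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every sample to every center (floored to avoid 0-division).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        # Standard update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def fisher_score(X, y):
    """Per-feature two-class Fisher criterion: (mu0 - mu1)^2 / (var0 + var1)."""
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    return num / (a.var(axis=0) + b.var(axis=0) + 1e-12)


# Toy data (hypothetical): 20 samples x 5 features; only feature 0
# separates the two groups.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 5))
X[10:, 0] += 8.0
y = np.array([0] * 10 + [1] * 10)

centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)       # hard assignment from soft memberships
scores = fisher_score(X, y)     # higher score = more discriminative feature
```

Unlike hard k-means, the membership matrix `U` keeps a graded assignment for each sample, which is useful when a sample sits between classes; ranking features by `scores` is one simple way to nominate candidate marker genes.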

RESULTS

The proposed models were tested on four published datasets: (1) leukemia, (2) colon cancer, (3) brain tumors, and (4) NCI cancer cell lines. The models gave class predictions with markedly reduced error rates compared to other class prediction approaches, and the results also underline the importance of feature selection in microarray data analysis.

CONCLUSIONS

Our models identify marker genes with predictive potential, often better than other methods available in the literature. The models are potentially useful for medical diagnostics and may offer some insight into cancer classification. Additionally, we illustrate two limitations of tumor classification from microarray data that stem from the biology underlying the data: (1) the class size of the data, and (2) the internal structure of classes. These limitations are not specific to the classification models used.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f24c/302113/7d267a34e211/1471-2105-4-60-1.jpg
