Clustering cancer gene expression data: a comparative study.

Authors

de Souto Marcilio C P, Costa Ivan G, de Araujo Daniel S A, Ludermir Teresa B, Schliep Alexander

Affiliation

Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.

Publication

BMC Bioinformatics. 2008 Nov 27;9:497. doi: 10.1186/1471-2105-9-497.


DOI: 10.1186/1471-2105-9-497
PMID: 19038021
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2632677/
Abstract

BACKGROUND: The use of clustering methods for the discovery of cancer subtypes has drawn a great deal of attention in the scientific community. While bioinformaticians have proposed new clustering methods that take advantage of characteristics of the gene expression data, the medical community has a preference for using "classic" clustering methods. There have been no studies thus far performing a large-scale evaluation of different clustering methods in this context.

RESULTS/CONCLUSION: We present the first large-scale analysis of seven different clustering methods and four proximity measures for the analysis of 35 cancer gene expression data sets. Our results reveal that the finite mixture of Gaussians, followed closely by k-means, exhibited the best performance in terms of recovering the true structure of the data sets. These methods also exhibited, on average, the smallest difference between the actual number of classes in the data sets and the best number of clusters as indicated by our validation criteria. Furthermore, hierarchical methods, which have been widely used by the medical community, exhibited a poorer recovery performance than that of the other methods evaluated. Moreover, as a stable basis for the assessment and comparison of different clustering methods for cancer gene expression data, this study provides a common group of data sets (benchmark data sets) to be shared among researchers and used for comparisons with new methods. The data sets analyzed in this study are available at http://algorithmics.molgen.mpg.de/Supplements/CompCancer/.
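
The abstract evaluates methods by how well they recover the true structure of each data set and by how far the chosen number of clusters lies from the actual number of classes. The abstract does not name the agreement index used, so the sketch below assumes one common choice, the adjusted Rand index, written in plain Python. The function names `adjusted_rand_index` and `cluster_count_gap` and the toy labels are hypothetical and serve only as illustration; they are not taken from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, cluster_labels):
    """Chance-corrected agreement between two partitions of the same samples.

    Assumption: this index stands in for the recovery measure; the abstract
    does not say which index the authors used.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    if n < 2:
        # With fewer than two samples there are no pairs to disagree on.
        return 1.0

    # Contingency counts: number of samples per (class, cluster) pair.
    pair_counts = Counter(zip(true_labels, cluster_labels))
    class_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)

    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_classes = sum(comb(c, 2) for c in class_counts.values())
    sum_clusters = sum(comb(c, 2) for c in cluster_counts.values())
    total_pairs = comb(n, 2)

    expected = sum_classes * sum_clusters / total_pairs
    max_index = (sum_classes + sum_clusters) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions place all samples in one group.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def cluster_count_gap(true_labels, cluster_labels):
    """Absolute difference between the number of classes and of clusters."""
    return abs(len(set(true_labels)) - len(set(cluster_labels)))


if __name__ == "__main__":
    # Hypothetical toy data: six tumour samples from two known subtypes.
    subtypes = ["A", "A", "A", "B", "B", "B"]
    candidate_partitions = {
        "perfect recovery": [1, 1, 1, 0, 0, 0],
        "everything merged": [0, 0, 0, 0, 0, 0],
    }
    for name, clusters in candidate_partitions.items():
        ari = adjusted_rand_index(subtypes, clusters)
        gap = cluster_count_gap(subtypes, clusters)
        print(f"{name}: ARI={ari:.2f}, cluster-count gap={gap}")
```

A score of 1 means the clustering reproduces the known classes exactly, while a score near 0 means agreement no better than chance. Relabelling the clusters does not change the score, which is why the first toy partition counts as perfect even though its labels differ from the subtype names.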

