Exploratory consensus of hierarchical clusterings for melanoma and breast cancer.

Affiliation

University of Newcastle, Australia.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2010 Jan-Mar;7(1):138-52. doi: 10.1109/TCBB.2008.33.

Abstract

Finding subtypes of heterogeneous diseases is one of the biggest challenges in biology. Clustering is often used to generate hypotheses about the subtypes of a heterogeneous disease. However, the clusterings produced by different algorithms usually disagree. This work introduces a simple method that identifies the clusters most consistent across three different clustering algorithms, applied to a melanoma and a breast cancer data set. The method is validated by showing that the Silhouette, Dunn's, and Davies-Bouldin's cluster validation indices are better for the proposed algorithm than those obtained by k-means and by another consensus clustering algorithm. The hypotheses suggested by the consensus clusters on both data sets are corroborated by clear genetic markers and 100 percent classification accuracy. In Bittner et al.'s melanoma data set, a previously hypothesized primary cluster is recognized as the largest consensus cluster, and a new partition of this cluster into two subclusters is proposed. In van't Veer et al.'s breast cancer data set, the previously proposed "basal" and "luminal A" subtypes are clearly recognized as the two predominant clusters. Furthermore, a new hypothesis is provided about the existence of two subgroups within the "basal" subtype in this data set. The clusters of van't Veer et al.'s data set are also validated by the high classification accuracy obtained on the data set of van de Vijver et al.
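The abstract does not spell out how the consensus is formed, so the following is only a minimal Python sketch of the general idea, not the authors' algorithm. It assumes a strict agreement rule (two samples share a consensus cluster only if all input clusterings place them together) and computes Dunn's index, one of the three validation indices named above. The function names `strict_consensus` and `dunn_index` and the toy data are hypothetical and introduced purely for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def strict_consensus(*labelings):
    """Group sample indices that every labeling places in the same cluster.

    Assumption for illustration: agreement must be unanimous. The paper's
    actual consensus rule may differ.
    """
    groups = defaultdict(list)
    # The tuple of labels a sample receives across all labelings is its key;
    # samples with identical keys are co-clustered by every algorithm.
    for index, key in enumerate(zip(*labelings)):
        groups[key].append(index)
    # Largest consensus cluster first.
    return sorted(groups.values(), key=len, reverse=True)


def dunn_index(points, clusters):
    """Dunn's index: smallest between-cluster distance divided by the
    largest within-cluster diameter (higher is better).

    Requires at least two clusters and at least one cluster with two or
    more distinct points.
    """
    min_between = min(
        dist(points[i], points[j])
        for a, b in combinations(clusters, 2)
        for i in a
        for j in b
    )
    max_diameter = max(
        dist(points[i], points[j])
        for cluster in clusters
        for i, j in combinations(cluster, 2)
    )
    return min_between / max_diameter


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels_a = [0, 0, 0, 1, 1, 1]  # e.g. output of one algorithm
labels_b = [1, 1, 1, 0, 0, 0]  # same partition, different label names
labels_c = [0, 0, 1, 2, 2, 2]  # disagrees about sample 2

consensus = strict_consensus(labels_a, labels_b, labels_c)
print(consensus)
print(dunn_index(points, consensus))
```

In this toy case the sample on which the third labeling disagrees is split off into its own cluster, while the group that all three labelings agree on survives intact as the largest consensus cluster.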
