Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York 10021, USA.
Clin Cancer Res. 2010 Mar 1;16(5):1358-67. doi: 10.1158/1078-0432.CCR-09-2398. Epub 2010 Feb 23.
In recent years, several investigative groups have sought to use array technologies that characterize somatic alterations in tumors, such as array comparative genomic hybridization (ACGH), to classify pairs of tumors from the same patient as either independent primary cancers or metastases. A wide variety of strategies have been proposed, and several groups have used hierarchical clustering for this purpose. This technique was popularized in genomics as a means of finding clusters of patients with similar gene expression patterns, with a view to identifying subcategories of tumors with distinct clinical characteristics. Unfortunately, the method is not well suited to the problem of classifying individual pairs of tumors as either clonal or independent. In this article, we show why hierarchical clustering is unsuitable for this purpose and why it has the paradoxical property that the probability of correctly identifying clonal tumor pairs declines as more information (i.e., more patients) is accrued. We discuss alternative strategies that have been proposed, which are based on more conventional conceptual formulations for statistical testing and diagnosis, and point to the remaining challenges in constructing valid and robust techniques for this problem.
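The paradox described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' analysis. It assumes binary alteration profiles, Hamming distance, and a simplified criterion: a truly clonal pair counts as "identified" only when its two tumors are strictly closer to each other than either is to any other tumor in the cohort, which is the condition under which an agglomerative dendrogram would join them directly. The function names and all parameter values (`n_markers`, `p_alter`, `p_diverge`) are hypothetical choices for illustration.

```python
import random


def random_profile(rng, n_markers, p):
    """Return a bitmask in which each marker is set with probability p."""
    bits = 0
    for i in range(n_markers):
        if rng.random() < p:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of markers at which two bitmask profiles differ."""
    return bin(a ^ b).count("1")


def pair_recovery_rates(cohort_sizes, n_markers=30, p_alter=0.3,
                        p_diverge=0.2, n_trials=300, seed=0):
    """Estimate how often one truly clonal pair is joined directly,
    for cohorts of different sizes (number of patients).

    Assumption: the other patients' tumors are drawn as unrelated
    random profiles, and the cohorts are nested, so a larger cohort
    always contains the smaller one plus additional tumors.
    """
    rng = random.Random(seed)
    max_n = max(cohort_sizes)
    hits = {n: 0 for n in cohort_sizes}
    for _ in range(n_trials):
        # Index patient: two tumors that diverged from a common ancestor.
        ancestor = random_profile(rng, n_markers, p_alter)
        t1 = ancestor ^ random_profile(rng, n_markers, p_diverge)
        t2 = ancestor ^ random_profile(rng, n_markers, p_diverge)
        d_pair = hamming(t1, t2)
        # Tumors from all other patients (two per patient).
        others = [random_profile(rng, n_markers, p_alter)
                  for _ in range(2 * (max_n - 1))]
        for n in cohort_sizes:
            competitors = others[: 2 * (n - 1)]
            # Distance from the index pair to its closest competitor.
            closest = min(
                (min(hamming(t1, o), hamming(t2, o)) for o in competitors),
                default=n_markers + 1,
            )
            if d_pair < closest:
                hits[n] += 1
    return {n: hits[n] / n_trials for n in cohort_sizes}


rates = pair_recovery_rates([2, 5, 20, 50])
for n_patients, rate in rates.items():
    print(n_patients, "patients:", rate)
```

Because the cohorts are nested, adding patients can only add competitors for the index pair, so the estimated recovery rate cannot increase with cohort size in this sketch. That mirrors, in a deliberately simplified setting, the declining probability of correct identification described above.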