Begg Colin B, Seshan Venkatraman E, Zabor Emily C, Furberg Helena, Arora Arshi, Shen Ronglai, Maranchie Jodi K, Nielsen Matthew E, Rathmell W Kimryn, Signoretti Sabina, Tamboli Pheroze, Karam Jose A, Choueiri Toni K, Hakimi A Ari, Hsieh James J
Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
BMC Med Res Methodol. 2014 Dec 22;14:138. doi: 10.1186/1471-2288-14-138.
The etiologic heterogeneity of cancer has traditionally been investigated by comparing risk factor frequencies within candidate sub-types, defined for example by histology or by distinct tumor markers of interest. Increasingly, tumors are being profiled for molecular features far more extensively, which greatly expands the opportunities for defining distinct sub-types. In this article we describe an exploratory analysis of the etiologic heterogeneity of clear cell kidney cancer. Data are available on the primary known risk factors for kidney cancer, while the tumors are characterized on a genome-wide basis using expression, methylation, copy number and mutational profiles.
We use a novel clustering strategy to identify sub-types. This is accomplished independently for the expression, methylation and copy number profiles. The goals are to identify tumor sub-types that are etiologically distinct, to identify the risk factors that define specific sub-types, and to endeavor to characterize the key genes that appear to represent the principal features of the distinct sub-types.
The analysis reveals strong evidence that gender represents an important factor that distinguishes disease sub-types. The sub-types defined using expression data and methylation data demonstrate considerable congruence and are also clearly correlated with mutations in important cancer genes. These sub-types are also strongly correlated with survival. The complexity of the data presents many analytical challenges including, prominently, the risk of false discovery.
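The abstract does not state how congruence between the expression-defined and methylation-defined sub-types was measured. As an illustration only, one common way to compare two clusterings of the same tumors is the adjusted Rand index, which is 1 for identical partitions and close to 0 for chance-level agreement. The sketch below assumes each tumor carries one cluster label per platform; the function name and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two cluster labelings of the same items.

    Illustrative helper (not the authors' method): labels_a and labels_b
    are equal-length sequences of cluster labels, one entry per tumor.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same tumors")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two tumors")

    # Contingency counts: tumors sharing a given pair of labels.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of tumor pairs placed together by both, by A, and by B.
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    together_a = sum(comb(c, 2) for c in a_counts.values())
    together_b = sum(comb(c, 2) for c in b_counts.values())

    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case, e.g. both labelings put every tumor in one cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Toy labels for six hypothetical tumors (not study data).
expression_subtype = ["E1", "E1", "E1", "E2", "E2", "E2"]
methylation_subtype = ["M2", "M2", "M2", "M1", "M1", "M1"]
ari = adjusted_rand_index(expression_subtype, methylation_subtype)
```

Because the index depends only on which tumors are grouped together, the label names themselves (E1 versus M2) do not need to match across platforms.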
Genomic profiling of tumors offers the opportunity to identify etiologically distinct sub-types, paving the way for a more refined understanding of cancer etiology.