Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York 10065, U.S.A.
Stat Med. 2013 Dec 20;32(29):5039-52. doi: 10.1002/sim.5902. Epub 2013 Jul 16.
Cancer has traditionally been studied using the disease site of origin as the organizing framework. However, recent advances in molecular genetics have begun to challenge this taxonomy: detailed molecular profiling has revealed subsets of tumors whose profiles have distinct clinical and biological characteristics. These discoveries are increasingly prompting research into whether such tumor subtypes have distinct etiologies. However, research in this field has been opportunistic and anecdotal, typically involving comparison of the distributions of individual risk factors between tumors classified on the basis of candidate tumor characteristics. The purpose of this article is to place this area of investigation within a more general conceptual and analytic framework, with a view to providing more efficient and practical strategies for designing and analyzing epidemiologic studies of etiologic heterogeneity. We propose a formal definition of etiologic heterogeneity and show that classifications of tumor subtypes with greater etiologic heterogeneity inevitably possess greater overall disease risk predictability. We outline analytic strategies for estimating the degree of etiologic heterogeneity among a set of subtypes and for choosing subtypes that optimize the heterogeneity, and we discuss technical challenges that require further methodologic research. We illustrate the ideas by using a pooled case-control study of breast cancer classified by expression patterns of genes known to define distinct tumor subtypes.
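For orientation, the conventional single-factor comparison that the abstract describes as typical of existing work can be sketched as a case-case odds ratio: the odds of exposure to one risk factor among cases of one subtype relative to cases of another. This is a minimal illustration of that conventional comparison only, not the heterogeneity framework the article proposes. The function name `case_case_odds_ratio` and all counts are hypothetical, invented purely for illustration.

```python
import math


def case_case_odds_ratio(exposed_a, unexposed_a, exposed_b, unexposed_b):
    """Odds ratio of exposure in subtype A cases relative to subtype B cases.

    Returns (odds_ratio, lower, upper), where lower and upper are the bounds
    of a Wald 95% confidence interval computed on the log scale.
    All four counts must be positive.
    """
    odds_ratio = (exposed_a * unexposed_b) / (unexposed_a * exposed_b)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(
        1 / exposed_a + 1 / unexposed_a + 1 / exposed_b + 1 / unexposed_b
    )
    z = 1.96  # normal quantile for a two-sided 95% interval
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not taken from any study): exposure to one risk factor
# among cases of two tumor subtypes, A and B.
estimate, ci_lower, ci_upper = case_case_odds_ratio(60, 140, 30, 170)
print(f"case-case OR = {estimate:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

An odds ratio near 1 would suggest that the factor is distributed similarly across the two subtypes, while a value far from 1 would point to possible etiologic heterogeneity for that factor. Repeating such a comparison factor by factor is the piecemeal approach that the article aims to replace with a single overall measure.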