Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York 10065, USA.
Am J Epidemiol. 2012 Sep 15;176(6):512-8. doi: 10.1093/aje/kws128. Epub 2012 Aug 24.
Contemporary searches for new risk factors frequently involve genome-wide explorations of very large numbers of candidate risk variants. Given that diseases can often be classified into subtypes that possess evidence of etiologic heterogeneity, the question arises as to whether or not a search for new risk factors would be improved by looking separately within subtypes. Etiologic risk heterogeneity inevitably increases the signal in at least one of the subtypes, but this advantage may be offset by smaller sample sizes and the increased chances of false discovery. In this article, the authors show that only a relatively modest degree of etiologic heterogeneity is necessary for the subtyping strategies to have improved statistical power. In practice, effective exploitation of etiologic heterogeneity requires strong evidence that the subtypes selected are likely to exhibit substantial heterogeneity. Further, defining the subtypes that demonstrate the most heterogeneous profiles is important for optimizing the search for new risk factors. The concepts are illustrated by using data from a breast cancer study in which results are available separately for estrogen receptor-positive (ER+) and -negative (ER-) tumors.
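The trade-off described above (a stronger signal within a subtype versus a smaller sample and a stricter significance threshold) can be illustrated with a rough power calculation. The sketch below is not the authors' method. It assumes a two-sided z-test under a normal approximation, two subtypes that share a single control group, a pooled effect equal to the case-weighted average of the subtype effects, and a Bonferroni split of the significance level across the two subtype tests. All function names, parameter names, and the numeric inputs in the usage example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for quantiles and tail areas


def z_test_power(effect, n_cases, n_controls, alpha):
    """Approximate two-sided power of a case-control z-test.

    `effect` is a standardized mean difference between cases and controls
    (an assumption made for this sketch, not a quantity from the article).
    """
    n_eff = n_cases * n_controls / (n_cases + n_controls)
    ncp = effect * sqrt(n_eff)  # noncentrality of the z statistic
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(ncp - z_crit) + _STD_NORMAL.cdf(-ncp - z_crit)


def compare_strategies(effect_a, effect_b, share_a, n_cases, n_controls, alpha):
    """Return (pooled_power, subtype_power) for a single candidate variant.

    Pooled: all cases versus controls at level `alpha`, with the effect taken
    as the case-weighted average of the two subtype effects.
    Subtype: each subtype versus all controls at level `alpha / 2`
    (Bonferroni). The power of the better of the two tests is reported,
    which is a lower bound on the chance that at least one test is significant.
    """
    pooled_effect = share_a * effect_a + (1 - share_a) * effect_b
    pooled_power = z_test_power(pooled_effect, n_cases, n_controls, alpha)

    n_a = share_a * n_cases
    n_b = n_cases - n_a
    power_a = z_test_power(effect_a, n_a, n_controls, alpha / 2)
    power_b = z_test_power(effect_b, n_b, n_controls, alpha / 2)
    return pooled_power, max(power_a, power_b)


if __name__ == "__main__":
    # Hypothetical inputs: 2000 cases split evenly, 2000 controls, and the
    # average effect across cases held at 0.1 while heterogeneity grows.
    for effect_a, effect_b in [(0.10, 0.10), (0.15, 0.05), (0.20, 0.00)]:
        pooled, subtype = compare_strategies(
            effect_a, effect_b, share_a=0.5,
            n_cases=2000, n_controls=2000, alpha=1e-4,
        )
        print(f"effects=({effect_a:.2f}, {effect_b:.2f}) "
              f"pooled={pooled:.3f} subtype={subtype:.3f}")
```

Under these assumptions, the pooled analysis has more power when the two subtype effects are equal, and the subtype analysis overtakes it as the effects diverge. The point at which that happens depends on the subtype proportions, the sample sizes, and the significance level chosen.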