IRIDIA-CoDE, Université Libre de Bruxelles, Brussels, Belgium.
PLoS Comput Biol. 2011 Oct;7(10):e1002240. doi: 10.1371/journal.pcbi.1002240. Epub 2011 Oct 20.
Bridging the gap between animal or in vitro models and human disease is essential in medical research. Researchers often suggest that a biological mechanism is relevant to human cancer because a gene expression marker (a signature) of this mechanism, discovered in an experimental system, is statistically associated with disease outcome in humans. We examined this argument for breast cancer. Surprisingly, we found that three gene expression signatures unrelated to cancer (of the effect of postprandial laughter, of social defeat in mice, and of skin fibroblast localization) were all significantly associated with breast cancer outcome. We next compared 47 published breast cancer outcome signatures to signatures made of random genes. Twenty-eight of them (60%) were not significantly better outcome predictors than random signatures of identical size, and 11 (23%) were worse predictors than the median random signature. More than 90% of random signatures of more than 100 genes were significant outcome predictors. We next derived a metagene, called meta-PCNA, by selecting the 1% of genes most positively correlated with the proliferation marker PCNA in a compendium of normal tissue expression data. Adjusting breast cancer expression data for meta-PCNA abrogated almost entirely the outcome association of published and random signatures. We also found that, in the absence of adjustment, the hazard ratio of a signature's outcome association correlated strongly with meta-PCNA (R² = 0.9). This relation also applied to single-gene expression markers. Moreover, more than 50% of the breast cancer transcriptome was correlated with meta-PCNA. A corollary was that purging cell cycle genes from a signature failed to rule out the confounding effect of proliferation. Hence, it is questionable to suggest that a mechanism is relevant to human breast cancer from the finding that a gene expression marker for this mechanism predicts human breast cancer outcome, because most markers do.
The methods we present help to overcome this problem.
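The two checks described above, comparing a signature against random signatures of the same size and adjusting expression data for a proliferation metagene, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a genes-by-samples matrix, scores a signature by its mean expression, summarises the metagene by the median of its genes, and uses an absolute Pearson correlation with a continuous outcome as a simple stand-in for a survival-model hazard ratio. All function names and the synthetic data in `demo` are hypothetical.

```python
import numpy as np

def signature_score(expr, gene_idx):
    # expr is genes x samples; the score is the mean expression of the signature genes.
    return expr[gene_idx].mean(axis=0)

def association(score, outcome):
    # Absolute Pearson correlation: an assumed stand-in for a survival-model statistic.
    return abs(np.corrcoef(score, outcome)[0, 1])

def random_signature_null(expr, outcome, size, n_random, rng):
    # Association statistics of n_random signatures made of `size` randomly drawn genes.
    stats = np.empty(n_random)
    for i in range(n_random):
        idx = rng.choice(expr.shape[0], size=size, replace=False)
        stats[i] = association(signature_score(expr, idx), outcome)
    return stats

def empirical_p(observed, null_stats):
    # Share of random signatures at least as strongly associated as the observed one.
    return (1 + np.sum(null_stats >= observed)) / (1 + len(null_stats))

def select_metagene(reference_expr, marker_idx, top_fraction=0.01):
    # Indices of the genes most positively correlated with the marker in a reference set.
    # Assumes no gene has constant expression (which would give a zero norm).
    centered = reference_expr - reference_expr.mean(axis=1, keepdims=True)
    marker = centered[marker_idx]
    corr = centered @ marker / (np.linalg.norm(centered, axis=1) * np.linalg.norm(marker))
    k = max(1, int(round(top_fraction * reference_expr.shape[0])))
    return np.argsort(corr)[-k:]

def adjust_for(expr, covariate):
    # Residuals of every gene after least-squares regression on the covariate.
    X = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(X, expr.T, rcond=None)
    return expr - (X @ beta).T

def demo(seed=0, n_genes=1000, n_samples=150):
    # Synthetic data: one latent "proliferation" factor drives most genes and the outcome.
    rng = np.random.default_rng(seed)
    loadings = rng.uniform(0.0, 1.0, n_genes)
    loadings[0] = 1.0  # gene 0 plays the role of the proliferation marker

    def simulate():
        p = rng.normal(size=n_samples)
        return p, loadings[:, None] * p + rng.normal(size=(n_genes, n_samples))

    _, reference = simulate()   # stands in for the normal-tissue compendium
    prolif, tumors = simulate() # stands in for the tumor expression data
    outcome = prolif + 0.5 * rng.normal(size=n_samples)

    meta_idx = select_metagene(reference, marker_idx=0)
    metagene = np.median(tumors[meta_idx], axis=0)
    adjusted = adjust_for(tumors, metagene)

    raw = random_signature_null(tumors, outcome, 50, 200, rng)
    adj = random_signature_null(adjusted, outcome, 50, 200, rng)
    return np.median(raw), np.median(adj)
```

Calling `demo()` returns the median association of random 50-gene signatures before and after adjustment. On this synthetic data the second value is smaller, mirroring the pattern the abstract reports. With real data, `association` would be replaced by a proper survival statistic and `empirical_p` would compare a published signature against its size-matched random signatures.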