Yu Kun, Ganesan Kumaresan, Miller Lance D, Tan Patrick
National Cancer Centre, Singapore, Republic of Singapore.
Clin Cancer Res. 2006 Jun 1;12(11 Pt 1):3288-96. doi: 10.1158/1078-0432.CCR-05-1530.
Previous reports using genome-wide gene expression data to classify breast tumors have typically used standard unsupervised or supervised techniques, both of which have known limitations. We hypothesized that novel clinically relevant information could be revealed in these data sets by an alternative analytic approach. Using a recently described algorithm, signature analysis (SA), we identified "modules," comprising groups of tightly coexpressed genes that are conditionally linked to particular tumors, in a series of breast tumor gene expression profiles.
SA successfully identified multiple breast cancer modules specifically linked to distinct biological functions. We identified a novel module, TuM1, whose presence was not readily discernible by conventional clustering techniques. The TuM1 module is expressed in a subset of estrogen receptor (ER)-positive tumors and is significantly enriched with genes involved in apoptosis and cell death. Clinically, TuM1-expressing tumors are associated with low histopathologic grade, and this association is independent of the inherent ER status of a tumor. We confirmed the robustness and general applicability of the TuM1 module by demonstrating its association with low tumor grade in multiple independent breast cancer data sets generated using different array technologies. In vitro, the TuM1 module is down-regulated in ER+ MCF7 cells upon treatment with tamoxifen, suggesting that TuM1 expression may be dependent on active signaling by ER. Initial data also suggest that TuM1 expression may be clinically associated with a patient's response to antihormonal therapy.
Our results suggest that module-based approaches to gene expression data can prove useful in identifying novel, robust, and biologically relevant signatures, even from data sets that have been the subject of substantial prior analysis.