Hilsenbeck S G, Friedrichs W E, Schiff R, O'Connell P, Hansen R K, Osborne C K, Fuqua S A
Department of Medicine, The University of Texas Health Science Center, San Antonio 78248-7884, USA.
J Natl Cancer Inst. 1999 Mar 3;91(5):453-9. doi: 10.1093/jnci/91.5.453.
Although the emerging complementary DNA (cDNA) array technology holds great promise to discern complex patterns of gene expression, its novelty means that there are no well-established standards to guide analysis and interpretation of the data that it produces. We have used preliminary data generated with the CLONTECH Atlas human cDNA array to develop a practical approach to the statistical analysis of these data by studying changes in gene expression during the development of acquired tamoxifen resistance in breast cancer.
For hybridization to the array, we prepared RNA from MCF-7 human breast cancer cell tumors isolated from our athymic nude mouse xenograft model of acquired tamoxifen resistance at three stages: estrogen-stimulated, tamoxifen-sensitive, and tamoxifen-resistant growth. Principal components analysis was used to identify genes with altered expression.
Principal components analysis yielded three principal components, interpreted as 1) the average level of gene expression, 2) the difference between estrogen-stimulated gene expression and the average of tamoxifen-sensitive and tamoxifen-resistant gene expression, and 3) the difference between tamoxifen-sensitive and tamoxifen-resistant gene expression. A bivariate 99% prediction region on the second and third principal components was used to identify outlier genes, that is, genes exhibiting altered expression. Two representative outlier genes, erk-2 and HSF-1 (heat shock transcription factor-1), were chosen for confirmatory study, and their predicted relative expression levels were confirmed by Western blot analysis, suggesting that semiquantitative estimates are possible with array technology.
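The outlier screen described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pca_outliers`, the synthetic data, and the use of a Gaussian ellipse (squared Mahalanobis distance compared with a chi-square quantile on 2 degrees of freedom) as the 99% prediction region are all assumptions made for illustration. The input is assumed to be a genes-by-3 matrix of log expression values, with columns ordered as estrogen-stimulated, tamoxifen-sensitive, and tamoxifen-resistant.

```python
import numpy as np

def pca_outliers(log_expr, level=0.99):
    """Flag genes outside a bivariate prediction region on PC2 and PC3.

    log_expr: (n_genes, 3) array of log expression values (assumed layout).
    Returns (loadings, scores, outlier_mask).
    """
    # Center each condition so the components describe variation across genes.
    centered = log_expr - log_expr.mean(axis=0)
    cov = np.cov(centered, rowvar=False)

    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Project every gene onto the principal components.
    scores = centered @ eigvecs

    # PC scores are uncorrelated, so the squared Mahalanobis distance in the
    # PC2/PC3 plane is the sum of squared scores divided by their variances.
    d2 = (scores[:, 1:3] ** 2 / eigvals[1:3]).sum(axis=1)

    # Chi-square quantile with 2 degrees of freedom has a closed form:
    # q = -2 * ln(1 - level). Assumes roughly Gaussian PC2/PC3 scores.
    threshold = -2.0 * np.log(1.0 - level)
    return eigvecs, scores, d2 > threshold

# Synthetic data for illustration only (not the study's measurements):
# each gene has a shared expression level plus small condition-specific noise.
rng = np.random.default_rng(0)
base = rng.normal(0.0, 1.0, size=500)
noise = rng.normal(0.0, 0.1, size=(500, 3))
log_expr = base[:, None] + noise
log_expr[0, 2] += 2.0  # hypothetical gene altered in the tamoxifen-resistant column

loadings, scores, outliers = pca_outliers(log_expr)
print("PC1 loadings:", np.round(loadings[:, 0], 2))
print("flagged genes:", np.flatnonzero(outliers))
```

In this toy setting the first component loads about equally on all three conditions, matching the "average level of expression" interpretation, while the planted gene falls well outside the region. With real array data, the Gaussian assumption behind the threshold would need to be checked.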
Principal components analysis provides a useful and practical method to analyze gene expression data from a cDNA array. The method can identify broad patterns of expression alteration and, based on a small simulation study, will likely provide reasonable power to detect moderate-sized alterations in clinically relevant genes.