Shankavaram Uma T, Reinhold William C, Nishizuka Satoshi, Major Sylvia, Morita Daisaku, Chary Krishna K, Reimers Mark A, Scherf Uwe, Kahn Ari, Dolginow Douglas, Cossman Jeffrey, Kaldjian Eric P, Scudiero Dominic A, Petricoin Emanuel, Liotta Lance, Lee Jae K, Weinstein John N
Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute/NIH, Bethesda, MD 20892, USA.
Mol Cancer Ther. 2007 Mar;6(3):820-32. doi: 10.1158/1535-7163.MCT-06-0650. Epub 2007 Mar 5.
To evaluate the utility of transcript profiling for prediction of protein expression levels, we compared profiles across the NCI-60 cancer cell panel, which represents nine tissues of origin. For that analysis, we present here two new NCI-60 transcript profile data sets (based on Affymetrix HG-U95 and HG-U133A chips; Affymetrix, Santa Clara, CA) and one new protein profile data set (based on reverse-phase protein lysate arrays). The data sets are available online at http://discover.nci.nih.gov in the CellMiner program package. Using the new transcript data in combination with our previously published cDNA array and Affymetrix HU6800 data sets, we first developed a "consensus set" of transcript profiles based on the four different microarray platforms. Using that set, we found that 65% of the genes showed statistically significant transcript-protein correlation, and the correlations were generally higher than those reported previously for panels of mammalian cells. Using the predictive analysis of microarray nearest shrunken centroid algorithm for functional prediction of tissue of origin, we then found that (a) the consensus mRNA set did better than did data from any of the individual mRNA platforms and (b) the protein data seemed to do somewhat better (P = 0.027) on a gene-for-gene basis in this particular study than did the consensus mRNA data, but both did well. Analysis based on the Gene Ontology showed protein levels of structure-related genes to be well predicted by mRNA levels (mean r = 0.71). Because the transcript-based technologies are more mature and are currently able to assess larger numbers of genes at one time, they continue to be useful, even when the ultimate aim is information about proteins.
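The gene-for-gene transcript-protein comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a Pearson coefficient (the abstract reports "mean r" without naming the coefficient), and the function names, gene labels, and toy values are hypothetical. Each gene's mRNA profile across cell lines is correlated with the same gene's protein profile, and only genes present in both data sets are compared.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A flat profile has no defined correlation.
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def per_gene_correlations(mrna, protein):
    """Correlate mRNA and protein profiles for genes present in both inputs.

    Both arguments map a gene label to a list of expression values,
    one value per cell line, with cell lines in the same order.
    """
    shared_genes = sorted(set(mrna) & set(protein))
    return {gene: pearson_r(mrna[gene], protein[gene]) for gene in shared_genes}


# Hypothetical toy profiles over four cell lines (not data from the study).
mrna = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0],
    "GENE_C": [2.0, 1.0, 4.0, 3.0],  # no protein measurement, so skipped
}
protein = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],
    "GENE_B": [4.0, 3.0, 2.0, 1.0],
}

correlations = per_gene_correlations(mrna, protein)
for gene, r in correlations.items():
    print(gene, round(r, 3))
```

A real analysis would also need a significance test for each coefficient and a correction for the number of genes tested, both omitted here.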