Cao Dengfeng, Hustinx Steven R, Sui Guoping, Bala P, Sato Norihiro, Martin Sean, Maitra Anirban, Murphy Kathleen M, Cameron John L, Yeo Charles J, Kern Scott E, Goggins Michael, Pandey Akhilesh, Hruban Ralph H
Department of Pathology, The Johns Hopkins Medical Institutions, Baltimore, Maryland 21231, USA.
Cancer Biol Ther. 2004 Nov;3(11):1081-9; discussion 1090-1. doi: 10.4161/cbt.3.11.1175. Epub 2004 Nov 12.
In most microarray experiments, a significant fraction of the differentially expressed mRNAs identified correspond to expressed sequence tags (ESTs) and are generally discarded from further analyses. We used careful bioinformatics analyses to characterize those ESTs that were found to be highly overexpressed in a series of pancreatic adenocarcinomas. cDNA was prepared from 60 non-neoplastic samples (normal pancreas [n = 20], normal colon [n = 10], or normal duodenal mucosa [n = 30]) and from 64 pancreatic cancers (resected cancers [n = 50] or cancer cell lines [n = 14]) and hybridized to the complete Affymetrix Human Genome U133 GeneChip® set (arrays U133A and B) for simultaneous analysis of 45,000 fragments corresponding to 33,000 known genes and 6,000 ESTs. Using the Fold Change Analysis Tool of the GeneExpress® software system, we identified 60 ESTs that were expressed at levels at least 3-fold greater in the pancreatic cancers than in normal tissues. Searches against the human genomic sequence and comparative genomic analyses of the human and mouse genomes were carried out using the Basic Local Alignment Search Tool (BLAST) programs BLASTN and BLASTX to identify protein-coding genes corresponding to the ESTs. Subsequently, to pick the most relevant candidate genes for more detailed analysis, we looked for domains/motifs in the open reading frames using the SMART and Pfam programs. We were able to definitively map 43 of the 60 ESTs to known or novel genes, and 15 of the ESTs could be localized in close proximity to a gene in the human genome, although we were unable to establish that these ESTs were indeed derived from those genes. The differential expression of a subset of genes was confirmed at the protein level by immunohistochemical labeling of tissue microarrays (inhibin beta A [INHBA] and CD29) and/or at the transcript level by RT-PCR (INHBA, AKAP12, ELK3, FOXQ1, EIF5A2, and EFNA5).
We conclude that bioinformatics tools can be used to characterize differentially overexpressed ESTs, and that some of these ESTs may represent diagnostically and therapeutically useful targets that might be missed using data solely from currently annotated databases.
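The 3-fold selection step described above can be illustrated with a minimal sketch. It assumes that fold change is the ratio of mean tumor intensity to mean normal intensity; the abstract does not say how the GeneExpress Fold Change Analysis Tool computes it, so this is not a reproduction of that tool. The probe names and intensity values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fold_change(tumor, normal):
    """Ratio of mean tumor intensity to mean normal intensity (assumed definition)."""
    return mean(tumor) / mean(normal)


def overexpressed(probes, threshold=3.0):
    """Return IDs of probes whose tumor/normal fold change is at least `threshold`."""
    return [
        probe_id
        for probe_id, (tumor, normal) in probes.items()
        if fold_change(tumor, normal) >= threshold
    ]


# Hypothetical intensities: probe ID -> (tumor samples, normal samples).
example_probes = {
    "probe_a": ([900.0, 1100.0, 1000.0], [100.0, 120.0, 80.0]),
    "probe_b": ([200.0, 220.0, 180.0], [150.0, 160.0, 140.0]),
    "probe_c": ([300.0, 300.0, 300.0], [100.0, 100.0, 100.0]),
}

print(overexpressed(example_probes))
```

Here `probe_a` (10-fold) and `probe_c` (exactly 3-fold) pass the filter, while `probe_b` (about 1.3-fold) does not.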