Ertel Adam, Tozeren Aydin
Center for Integrated Bioinformatics, School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104, USA.
BMC Genomics. 2008 Jan 4;9:3. doi: 10.1186/1471-2164-9-3.
Recent studies have placed gene expression in the context of distribution profiles, including housekeeping, graded, and bimodal (switch-like) expression. Single-gene studies have shown that bimodal expression results from healthy cell signaling and from complex diseases such as cancer; however, developing a comprehensive list of human bimodal genes has remained a major challenge because of the inherent noise in human microarray data. This study presents a two-component mixture analysis of mouse gene expression data for genes on the Affymetrix MG-U74Av2 array for the detection and annotation of switch-like genes. Two-component normal mixtures were fit to the data to identify bimodal genes and their potential roles in cell signaling and disease progression.
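The two-component normal mixture fit described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the fitting procedure or the criterion used to call a gene bimodal, so the EM routine, the median-split initialization, the BIC comparison against a single normal, and all function and key names (`fit_two_component_mixture`, `is_bimodal`, `mu_low`, and so on) are assumptions made for illustration only.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_single_normal(values, min_sigma=1e-3):
    """Maximum-likelihood fit of one normal; returns its log-likelihood."""
    n = len(values)
    mu = sum(values) / n
    sigma = max(math.sqrt(sum((x - mu) ** 2 for x in values) / n), min_sigma)
    return sum(math.log(max(normal_pdf(x, mu, sigma), 1e-300)) for x in values)


def fit_two_component_mixture(values, n_iter=200, min_sigma=1e-3):
    """EM fit of a two-component 1-D normal mixture (illustrative only)."""
    xs = sorted(values)
    n = len(xs)
    # Assumed initialization: split the sorted expression values at the median.
    low, high = xs[: n // 2], xs[n // 2:]
    mu1, mu2 = sum(low) / len(low), sum(high) / len(high)
    mean = sum(xs) / n
    s1 = s2 = max(math.sqrt(sum((x - mean) ** 2 for x in xs) / n), min_sigma)
    w = 0.5  # mixing weight of the low-expression component
    for _ in range(n_iter):
        # E-step: probability that each value belongs to the low component.
        resp = []
        for x in xs:
            p1 = w * normal_pdf(x, mu1, s1)
            p2 = (1.0 - w) * normal_pdf(x, mu2, s2)
            total = p1 + p2
            resp.append(p1 / total if total > 0.0 else 0.5)
        n1 = sum(resp)
        n2 = n - n1
        if n1 < 1e-9 or n2 < 1e-9:
            break  # one component collapsed; keep the last valid estimates
        # M-step: re-estimate weight, means, and standard deviations.
        w = n1 / n
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n2
        s1 = max(math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, xs)) / n1), min_sigma)
        s2 = max(math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, xs)) / n2), min_sigma)
    loglik = sum(
        math.log(max(w * normal_pdf(x, mu1, s1) + (1.0 - w) * normal_pdf(x, mu2, s2), 1e-300))
        for x in xs
    )
    return {"weight_low": w, "mu_low": mu1, "sigma_low": s1,
            "mu_high": mu2, "sigma_high": s2, "loglik": loglik}


def is_bimodal(values):
    """Assumed criterion: the mixture (5 parameters) has lower BIC than one normal (2)."""
    n = len(values)
    bic_single = 2 * math.log(n) - 2.0 * fit_single_normal(values)
    bic_mixture = 5 * math.log(n) - 2.0 * fit_two_component_mixture(values)["loglik"]
    return bic_mixture < bic_single
```

In this sketch, `values` would hold one gene's log-scale expression measurements across all arrays; calling `is_bimodal(values)` per gene gives a candidate switch-like call, and the fitted `mu_low` and `mu_high` give the two modes against which a sample's expression could be compared.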
Seventeen percent of the genes on the MG-U74Av2 array (1519 out of 9091) were identified as bimodal or switch-like. KEGG pathways significantly enriched for bimodal genes included ECM-receptor interaction, cell communication, and focal adhesion. Similarly, the GO biological process "cell adhesion" and cellular component "extracellular matrix" were significantly enriched. Switch-like genes were found to be associated with such diseases as congestive heart failure, Alzheimer's disease, arteriosclerosis, breast neoplasms, hypertension, myocardial infarction, obesity, rheumatoid arthritis, and type I and type II diabetes. In diabetes alone, over two hundred bimodal genes were in a different mode of expression compared to normal tissue.
This research identified and annotated bimodal or switch-like genes in the mouse genome using a large collection of microarray data. Genes with bimodal expression were enriched within the cell membrane and extracellular environment. Hundreds of bimodal genes demonstrated alternate modes of expression in diabetic muscle, pancreas, liver, heart, and adipose tissue. Bimodal genes comprise a candidate set of biomarkers for a large number of disease states because their expression is tightly regulated at the transcriptional level.