Li Huai, Zhan Ming
Bioinformatics Unit, Research Resources Branch, National Institute on Aging, NIH, Baltimore, MD 21224, USA.
J Proteomics Bioinform. 2009 Mar 12;2:117. doi: 10.4172/jpb.1000068.
Cross-species comparison of gene expression profiles helps decipher the fundamental and species-specific transcriptional programs of cells and offers insight into the organization and evolution of the genome and its genetic networks. Here, we propose an algorithm for comparing microarray data from different species to identify transcriptional modules that are conserved or divergent through evolution. The algorithm is based on a cross-species matrix decomposition that applies a nonlinear independent component analysis followed by a generalized probabilistic sparse matrix factorization to microarray data from different species. It captures transcriptional modularity that might result from highly nonlinear interactions among genes, and it partitions genes into mutually non-exclusive transcriptional modules. Conserved transcriptional modules are identified by the latent variables associated with predominant biological prototypes shared across species. We illustrate the algorithm with an analysis of human and mouse embryonic stem cell (ESC) data. The analysis uncovered conserved and divergent transcriptional modules in the ESC transcriptomes, shedding light on the fundamental and species-specific regulatory mechanisms that control ESC development.
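The abstract describes two downstream ideas that a small example may make concrete: genes can belong to more than one module, and a module counts as conserved when the same latent variable groups orthologous genes in both species. The sketch below illustrates only that final step, under several assumptions that are not taken from the paper: the gene-by-latent-variable loadings are already available from some factorization, a fixed loading threshold decides membership, and a Jaccard overlap of at least 0.5 marks a module as conserved. It does not implement the nonlinear independent component analysis or the probabilistic sparse matrix factorization. The function names, gene labels, thresholds, and loading values are all hypothetical toy inputs.

```python
from typing import Dict, List, Set

# Hypothetical type: gene name -> loading on each shared latent variable.
Loadings = Dict[str, List[float]]


def assign_modules(loadings: Loadings, threshold: float) -> List[Set[str]]:
    """Put a gene in every module where |loading| >= threshold.

    Membership is non-exclusive: one gene may appear in several modules.
    """
    n_modules = len(next(iter(loadings.values())))
    modules: List[Set[str]] = [set() for _ in range(n_modules)]
    for gene, row in loadings.items():
        for k, value in enumerate(row):
            if abs(value) >= threshold:
                modules[k].add(gene)
    return modules


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_modules(
    human: Loadings,
    mouse: Loadings,
    mouse_to_human: Dict[str, str],
    threshold: float = 0.5,    # assumed membership cutoff
    min_overlap: float = 0.5,  # assumed cutoff for calling a module conserved
) -> List[str]:
    """Label each shared latent variable 'conserved' or 'divergent'."""
    human_with_ortholog = set(mouse_to_human.values())
    labels = []
    for h_mod, m_mod in zip(assign_modules(human, threshold),
                            assign_modules(mouse, threshold)):
        # Compare only genes that have an ortholog, in human gene names.
        h_set = h_mod & human_with_ortholog
        m_set = {mouse_to_human[g] for g in m_mod if g in mouse_to_human}
        overlap = jaccard(h_set, m_set)
        labels.append("conserved" if overlap >= min_overlap else "divergent")
    return labels


# Toy loadings (made-up numbers, two latent variables per gene).
human = {"hA": [0.9, 0.1], "hB": [0.8, 0.7], "hC": [0.0, 0.9], "hD": [0.1, 0.6]}
mouse = {"mA": [0.85, 0.0], "mB": [0.7, 0.1], "mC": [0.2, 0.1],
         "mD": [0.0, 0.1], "mE": [0.1, 0.9]}
mouse_to_human = {"mA": "hA", "mB": "hB", "mC": "hC", "mD": "hD"}

labels = classify_modules(human, mouse, mouse_to_human)
print(labels)  # ['conserved', 'divergent']
```

In this toy input, gene hB passes the threshold for both latent variables and so lands in two modules, which is what non-exclusive partitioning means here. The first latent variable groups the same orthologs in both species, while the second is driven in mouse by a gene with no human ortholog, so it is labeled divergent.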