Zhang Shaohong, Poon Ellen, Xie Dongqing, Boheler Kenneth R, Li Ronald A, Wong Hau-San
Department of Computer Science, Guangzhou University, Guangzhou, P.R. China.
Stem Cell & Regenerative Medicine Consortium, University of Hong Kong, Hong Kong.
PLoS One. 2015 May 4;10(5):e0125442. doi: 10.1371/journal.pone.0125442. eCollection 2015.
Global transcriptional analyses have been performed on human embryonic stem cell (hESC)-derived cardiomyocytes (CMs) to identify molecules and pathways important for human CM differentiation, but variations in culture and profiling conditions have led to widely divergent results across studies. A consensus investigation that identifies genes and gene sets enriched in multiple studies is important for revealing differential gene expression that is intrinsic to human CM differentiation and independent of these variables, but reliable methods for conducting such comparisons are lacking. We examined differential gene expression between hESCs and hESC-CMs across multiple microarray studies. For single-gene analysis, we identified genes that were expressed at increased levels in hESC-CMs in seven datasets and that have not been previously highlighted. For gene set analysis, we developed a new algorithm, consensus comparative analysis (CSSCMP), capable of evaluating the enrichment of gene sets from heterogeneous data sources. Based on both theoretical analysis and experimental validation, CSSCMP is more efficient and less susceptible to experimental variation than traditional methods. We applied CSSCMP to hESC-CM microarray data and revealed novel gene set enrichment (e.g., glucocorticoid stimulus), and we also identified genes that might mediate this response. Our results provide important molecular information intrinsic to hESC-CM differentiation. Data and MATLAB code can be downloaded from S1 Data.
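As a rough illustration of the single-gene consensus idea described in the abstract (keeping only genes that are upregulated in hESC-CMs in every dataset), a minimal Python sketch follows. It is not the CSSCMP algorithm and not the authors' MATLAB code. The function name `consensus_upregulated`, the log2 fold-change threshold of 1.0, the input layout, and the toy gene names and values are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def consensus_upregulated(
    log2_fold_changes: Dict[str, Dict[str, float]],
    threshold: float = 1.0,  # assumed cutoff, not taken from the paper
) -> List[str]:
    """Return genes whose log2 fold change (hESC-CM vs. hESC) exceeds
    `threshold` in every dataset, sorted alphabetically.

    `log2_fold_changes` maps a dataset name to a {gene: log2 fold change}
    dictionary. A gene missing from any dataset is excluded.
    """
    # One set of upregulated genes per dataset.
    per_dataset: List[Set[str]] = [
        {gene for gene, lfc in genes.items() if lfc > threshold}
        for genes in log2_fold_changes.values()
    ]
    if not per_dataset:
        return []
    # Consensus = genes present in the upregulated set of every dataset.
    return sorted(set.intersection(*per_dataset))


# Toy input with hypothetical dataset names, gene names, and values.
toy = {
    "study_A": {"GENE1": 2.3, "GENE2": 0.4, "GENE3": 1.8},
    "study_B": {"GENE1": 1.6, "GENE2": 1.9, "GENE3": 1.2},
    "study_C": {"GENE1": 3.1, "GENE3": 0.2},
}

print(consensus_upregulated(toy))  # ['GENE1']
```

A strict intersection like this is the simplest possible consensus rule. It discards any gene that falls below the cutoff in a single study, which is one reason a method designed for heterogeneous data sources, such as the one the abstract describes, may be preferable for gene set analysis.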