Michaud Dennis J, Marsh Adam G, Dhurjati Prasad S
Department of Chemical Engineering, University of Delaware, Newark, DE 19716, USA.
Bioinformatics. 2003 Jun 12;19(9):1140-6. doi: 10.1093/bioinformatics/btg132.
Experimental gene expression data sets, such as those generated by microarray or gene chip experiments, typically have significant noise and complicated interconnectivities that make understanding even simple regulatory patterns difficult. Given these complications, characterizing the effectiveness of different analysis techniques to uncover network groups and structures remains a challenge. Generating simulated expression patterns with known biological features of expression complexity, diversity and interconnectivities provides a more controlled means of investigating the appropriateness of different analysis methods. A simulation-based approach can systematically evaluate different gene expression analysis techniques and provide a basis for improved methods in dynamic metabolic network reconstruction.
We have developed an online simulator, called eXPatGen, to generate dynamic gene expression patterns typical of microarray experiments. eXPatGen provides a quantitative network structure to represent key biological features, including the induction, repression, and cascade regulation of messenger RNA (mRNA). The simulation is modular, so the expression model can be replaced with other representations depending on the level of biological detail the user requires. Two example gene networks, of 25 and 100 genes respectively, were simulated. Two standard analysis techniques, clustering and principal component analysis (PCA), were applied to the resulting expression patterns to demonstrate how the simulator might be used to evaluate different analysis methods and to provide experimental guidance for biological studies of gene expression.
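The workflow described above (simulate a network with known regulation, then apply a standard analysis to the noisy output) can be illustrated with a minimal sketch. This is not the eXPatGen model, whose equations are not given here. The sigmoidal synthesis term, first-order decay, Euler integration, chain-shaped cascade, and all parameter values and function names (`simulate_expression`, `pca`) are assumptions chosen only for illustration.

```python
import numpy as np


def simulate_expression(n_genes=25, n_steps=200, dt=0.05, noise_sd=0.02, seed=0):
    """Simulate mRNA levels for a hypothetical regulatory cascade.

    Assumed model (not taken from the source): dm_i/dt = s_i(m) - decay * m_i,
    where s_i is a sigmoid of the regulatory input to gene i.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical regulatory matrix: W[i, j] > 0 means gene j induces gene i,
    # W[i, j] < 0 means gene j represses gene i. Each gene is regulated by its
    # predecessor, so the network forms a simple cascade.
    W = np.zeros((n_genes, n_genes))
    for i in range(1, n_genes):
        W[i, i - 1] = 1.0 if i % 3 else -1.0

    decay = 1.0
    m = np.zeros((n_steps, n_genes))
    m[0] = rng.uniform(0.0, 0.1, n_genes)

    for t in range(1, n_steps):
        prev = m[t - 1]
        drive = W @ prev
        drive[0] = 1.0  # gene 0 receives a constant external stimulus
        synthesis = 1.0 / (1.0 + np.exp(-4.0 * (drive - 0.5)))
        m[t] = prev + dt * (synthesis - decay * prev)  # explicit Euler step

    # Additive Gaussian noise stands in for measurement noise.
    noisy = m + rng.normal(0.0, noise_sd, m.shape)
    return W, m, noisy


def pca(data, n_components=2):
    """PCA via SVD; rows are observations, columns are variables."""
    centered = data - data.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    scores = centered @ vt[:n_components].T
    return scores, explained[:n_components]


if __name__ == "__main__":
    W, clean, noisy = simulate_expression()
    # Treat each gene as an observation described by its time course.
    scores, explained = pca(noisy.T)
    print("variance explained by first two components:", explained)
```

Because the regulatory matrix `W` is known, the gene groupings recovered from `scores` (or from any clustering method) can be compared directly against the true network, which is the kind of evaluation the simulator is meant to support.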