Wang Chen, Xuan Jianhua, Chen Li, Zhao Po, Wang Yue, Clarke Robert, Hoffman Eric
Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, USA.
BMC Bioinformatics. 2008;9 Suppl 1(Suppl 1):S21. doi: 10.1186/1471-2105-9-S1-S21.
Network Component Analysis (NCA) has proven effective at discovering regulators and inferring transcription factor activities (TFAs) when both microarray data and ChIP-on-chip data are available. However, an NCA scheme is not applicable to many biological studies because the available topology information is limited, for example when ChIP-on-chip data are lacking. We propose a new approach, motif-directed NCA (mNCA), that integrates motif information and gene expression data to infer regulatory networks.
We develop motif-directed NCA (mNCA) to incorporate motif information into NCA for regulatory network inference. While motif information is readily available from knowledge databases, it is a "noisy" source of network topology information that contains many false positives. To overcome this problem, we embed a stability analysis procedure in mNCA to resolve inconsistencies between motif information and gene expression data and to enable the identification of stable TFAs. The mNCA approach has been applied to a time-course microarray data set of muscle regeneration. The experimental results show that the inferred TFAs are not only numerically stable but also biologically relevant to the muscle differentiation process. In particular, several inferred TFAs, such as those of MyoD, myogenin and YY1, are well supported by biological experiments.
A novel computational approach, mNCA, has been developed to integrate motif information and gene expression data for regulatory network reconstruction. Specifically, motif analysis is used to obtain initial network topology, and stability analysis is developed and applied with mNCA to extract stable TFAs. Experimental results on muscle regeneration microarray data have demonstrated that mNCA is a practical and reliable computational method for regulatory network inference and pathway discovery.
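The workflow summarized above (a motif-derived topology constrains the NCA fit, then a stability check filters the inferred TFAs) can be illustrated with a minimal sketch. It assumes the standard NCA model E ≈ A·P, where E is the gene-by-sample expression matrix, A is the gene-by-TF connectivity matrix restricted to the motif-derived edges, and P holds the TFAs. The function names `nca_als` and `tfa_stability`, the alternating least-squares solver, the random edge-dropping perturbation, and the correlation-based score are all illustrative assumptions; the abstract does not specify the authors' actual stability procedure.

```python
import numpy as np

def nca_als(E, Z, n_iter=200, seed=0):
    """Fit E ~ A @ P by alternating least squares (illustrative solver).

    E: (genes x samples) expression matrix.
    Z: boolean (genes x TFs) mask, True where a motif suggests an edge.
    Entries of A outside the mask are held at zero. Assumes n_iter >= 1.
    """
    rng = np.random.default_rng(seed)
    n_genes, _ = Z.shape
    A = np.where(Z, rng.standard_normal(Z.shape), 0.0)
    for _ in range(n_iter):
        # Update TF activities P with the connectivity A held fixed.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Update each gene's row of A using only its permitted TFs.
        for g in range(n_genes):
            idx = np.flatnonzero(Z[g])
            if idx.size:
                A[g, idx] = np.linalg.lstsq(P[idx].T, E[g], rcond=None)[0]
    return A, P

def tfa_stability(E, Z, n_trials=20, drop_frac=0.1, n_iter=200, seed=0):
    """Score each TF by how well its activity profile survives perturbation.

    Assumption: the topology is perturbed by randomly dropping a fraction
    of edges, and stability is the mean absolute Pearson correlation with
    the unperturbed profile (NCA recovers TFAs only up to scale and sign).
    """
    rng = np.random.default_rng(seed)
    _, P_ref = nca_als(E, Z, n_iter=n_iter, seed=seed)
    n_tfs = Z.shape[1]
    scores = np.zeros((n_trials, n_tfs))
    for t in range(n_trials):
        Z_pert = Z & (rng.random(Z.shape) >= drop_frac)
        _, P = nca_als(E, Z_pert, n_iter=n_iter, seed=seed)
        for k in range(n_tfs):
            # A constant profile has no defined correlation; score it 0.
            if P_ref[k].std() > 0 and P[k].std() > 0:
                scores[t, k] = abs(np.corrcoef(P_ref[k], P[k])[0, 1])
    return scores.mean(axis=0)
```

Under these assumptions, TFs whose scores stay close to 1 would be treated as stable, while low-scoring TFs would indicate that the motif-derived edges disagree with the expression data. The threshold separating the two is a design choice that this sketch leaves open.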