Liao James C, Boscolo Riccardo, Yang Young-Lyeol, Tran Linh My, Sabatti Chiara, Roychowdhury Vwani P
Departments of Chemical Engineering, University of California, Los Angeles, CA 90095, USA.
Proc Natl Acad Sci U S A. 2003 Dec 23;100(26):15522-7. doi: 10.1073/pnas.2136632100. Epub 2003 Dec 12.
High-dimensional data sets generated by high-throughput technologies, such as DNA microarrays, are often the outputs of complex networked systems driven by hidden regulatory signals. Traditional statistical methods for computing low-dimensional or hidden representations of these data sets, such as principal component analysis and independent component analysis, ignore the underlying network structures and provide decompositions based purely on a priori statistical constraints on the computed component signals. The resulting decomposition thus provides a phenomenological model for the observed data and does not necessarily contain physically or biologically meaningful signals. Here, we develop a method, called network component analysis, for uncovering hidden regulatory signals from the outputs of networked systems when only partial knowledge of the underlying network topology is available. The a priori network structure information is first tested for compliance with a set of identifiability criteria. For networks that satisfy the criteria, the signals from the regulatory nodes and their strengths of influence on each output node can be faithfully reconstructed. This method is first validated experimentally by using the absorbance spectra of a network of various hemoglobin species. The method is then applied to microarray data generated from the yeast Saccharomyces cerevisiae, and the activities of various transcription factors during the cell cycle are reconstructed by using recently discovered connectivity information for the underlying transcriptional regulatory networks.
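The idea of reconstructing regulatory signals under a known connectivity pattern can be illustrated with a minimal sketch. It assumes a bilinear model E ≈ A·P, where E holds the output measurements, A holds the connection strengths (with zeros wherever the prior topology says there is no connection), and P holds the hidden regulatory signals. The abstract does not state the fitting procedure, so the alternating least-squares loop below is an assumed illustration rather than the authors' algorithm, and the names `nca_als`, `E`, `Z`, `A`, and `P` are hypothetical. The sketch also skips the identifiability check described above, and any such decomposition is determined only up to a scale factor per regulatory signal.

```python
import numpy as np

def nca_als(E, Z, n_iter=200, seed=0):
    """Fit E ~ A @ P with A constrained to the zero pattern of Z.

    E : (N, M) array of outputs (e.g. N genes, M conditions).
    Z : (N, L) boolean array; Z[i, j] is True where output i may be
        influenced by regulator j (assumed prior topology).
    Returns A (N, L) and P (L, M).
    """
    E = np.asarray(E, dtype=float)
    Z = np.asarray(Z, dtype=bool)
    rng = np.random.default_rng(seed)
    N, M = E.shape
    L = Z.shape[1]

    # Random start that already respects the allowed connections.
    A = np.where(Z, rng.standard_normal((N, L)), 0.0)
    P = np.zeros((L, M))

    for _ in range(n_iter):
        # Step 1: hold A fixed, solve for the signals P by least squares.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Step 2: hold P fixed, refit each row of A using only the
        # regulators that the topology allows for that output.
        for i in range(N):
            idx = np.flatnonzero(Z[i])
            if idx.size:
                A[i, idx] = np.linalg.lstsq(P[idx].T, E[i], rcond=None)[0]
    return A, P

# Synthetic usage: 8 outputs, 3 regulators, 20 conditions (made-up data).
rng = np.random.default_rng(1)
Z_demo = rng.random((8, 3)) < 0.5
A_true = np.where(Z_demo, rng.standard_normal((8, 3)), 0.0)
P_true = rng.standard_normal((3, 20))
E_demo = A_true @ P_true
A_hat, P_hat = nca_als(E_demo, Z_demo)
```

Because each step is an exact least-squares solve, the residual of the fit cannot increase from one iteration to the next, and the zero pattern of `A` is preserved by construction. Convergence to the true signals is not guaranteed by this sketch alone; that depends on the network satisfying the identifiability criteria.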