Department of Biomedical Engineering, University of Virginia, Charlottesville, VA 22908, USA.
Phys Biol. 2012 Aug;9(4):045004. doi: 10.1088/1478-3975/9/4/045004. Epub 2012 Aug 7.
Cellular signal transduction is coordinated by modifications of many proteins within cells. Protein modifications are not independent, because some are connected through shared signaling cascades and others jointly converge upon common cellular functions. This coupling creates a hidden structure within a signaling network that can point to higher level organizing principles of interest to systems biology. One can identify important covariations within large-scale datasets by using mathematical models that extract latent dimensions, the key structural elements of a measurement set. In this paper, we introduce two principal component-based methods for identifying and interpreting latent dimensions. Principal component analysis provides a starting point for unbiased inspection of the major sources of variation within a dataset. Partial least-squares regression reorients these dimensions toward a specific hypothesis of interest. Both approaches have been used widely in studies of cell signaling, and they should be standard analytical tools once highly multivariate datasets become straightforward to accumulate.
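The contrast between the two methods can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The data are synthetic, and every name in it (`X`, `y`, `latent`, `loadings`, the 40 x 12 matrix shape) is a hypothetical stand-in for a real signaling dataset. Only the first partial least-squares direction is computed, using the standard single-response form in which the weight vector is proportional to X^T y. The sketch assumes mean-centered data and no variance scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 40 conditions x 12 measured signals,
# generated from two hidden (latent) factors plus noise.
n_samples, n_signals = 40, 12
latent = rng.normal(size=(n_samples, 2))
latent[:, 0] *= 3.0  # factor 0 carries most of the variance
loadings = rng.normal(size=(2, n_signals))
X = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_signals))

# Hypothetical response (e.g. a cell phenotype) driven by the weaker factor 1.
y = latent[:, 1] + 0.1 * rng.normal(size=n_samples)

# Mean-center both blocks before extracting latent dimensions.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# PCA via singular value decomposition: rows of Vt are the principal axes,
# ordered by the variance they capture, with no reference to y.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)  # fraction of variance per component
pc_scores = U * s                # sample scores, equal to Xc @ Vt.T

# First PLS direction: the unit weight vector that maximizes the
# covariance between the projected signals Xc @ w and the response.
w = Xc.T @ yc
w /= np.linalg.norm(w)
pls_scores = Xc @ w

# Compare how strongly each leading dimension covaries with the response.
pc1_cov = abs(pc_scores[:, 0] @ yc) / (n_samples - 1)
pls_cov = abs(pls_scores @ yc) / (n_samples - 1)

print("variance explained by PC1, PC2:", explained[:2])
print("|cov(PC1 scores, y)|  :", pc1_cov)
print("|cov(PLS1 scores, y)| :", pls_cov)
```

Among all unit-length directions, the first PLS weight vector has the largest covariance with the response by construction, so `pls_cov` can never be smaller than `pc1_cov`. The first principal component, by contrast, follows the largest source of variation in `X` whether or not it relates to `y`.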