Ebbels Timothy M D, Buxton Bernard F, Jones David T
Bioinformatics Unit, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT.
Bioinformatics. 2006 Jul 15;22(14):e99-107. doi: 10.1093/bioinformatics/btl205.
The interpretation of microarray and other high-throughput data is highly dependent on the biological context of experiments. However, standard analysis packages are poor at presenting the array data and related bioinformatic data simultaneously. We have addressed this challenge by developing a system, springScape, based on 'spring embedding' and an 'information landscape', which allows several related data sources to be dynamically combined while highlighting one particular feature. Each data source is represented as a network of nodes connected by weighted edges. The networks are combined and embedded in the 2-D plane by spring embedding, such that nodes with high similarity are drawn close together. Complex relationships can be discovered by varying the weight of each data source and observing the dynamic response of the spring network. Using a modified Procrustes analysis, we find that the visualizations have an acceptable degree of reproducibility. The 'information landscape' highlights one particular data source, displaying it as a smooth surface whose height is proportional to both the information being viewed and the density of nodes. The algorithm is demonstrated using several microarray data sets in combination with protein-protein interaction data and GO annotations. Among the features revealed are the spatio-temporal profile of gene expression and the identification of GO terms correlated with gene expression and protein interactions. The power of this combined display lies in its interactive feedback and its exploitation of human visual pattern recognition. Overall, springScape shows promise as a tool for the interpretation of microarray data in the context of relevant bioinformatic information.
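The idea of combining weighted networks and embedding them with springs can be illustrated with a minimal sketch. It is not the springScape implementation: the force model (zero-rest-length springs scaled by similarity, plus a 1/distance repulsion), the function names `combine_networks`, `spring_embed` and `distance`, all parameter values, and the toy six-node data are assumptions made purely for illustration.

```python
import math
import random


def combine_networks(networks, weights):
    """Merge several {(i, j): similarity} networks into one weighted edge set."""
    combined = {}
    for network, weight in zip(networks, weights):
        for (i, j), similarity in network.items():
            key = (min(i, j), max(i, j))  # treat edges as undirected
            combined[key] = combined.get(key, 0.0) + weight * similarity
    return combined


def spring_embed(n_nodes, edges, iterations=500, step=0.05,
                 repulsion=0.01, max_move=0.1, seed=0):
    """Place nodes in 2-D so that strongly connected nodes end up close together."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
           for _ in range(n_nodes)]
    for _ in range(iterations):
        force = [[0.0, 0.0] for _ in range(n_nodes)]
        # Every pair of nodes repels, with magnitude repulsion / distance.
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
                fx, fy = repulsion * dx / d2, repulsion * dy / d2
                force[i][0] += fx
                force[i][1] += fy
                force[j][0] -= fx
                force[j][1] -= fy
        # Each edge acts as a spring whose stiffness is the combined similarity.
        for (i, j), w in edges.items():
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            force[i][0] += w * dx
            force[i][1] += w * dy
            force[j][0] -= w * dx
            force[j][1] -= w * dy
        # Move each node along its net force, capping the per-step displacement.
        for i in range(n_nodes):
            mx, my = step * force[i][0], step * force[i][1]
            length = math.hypot(mx, my)
            if length > max_move:
                mx, my = mx * max_move / length, my * max_move / length
            pos[i][0] += mx
            pos[i][1] += my
    return pos


def distance(pos, i, j):
    """Euclidean distance between nodes i and j in a layout."""
    return math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1])


# Toy data (hypothetical): two co-expressed gene groups and one interaction.
expression = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
              (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0}
interactions = {(2, 3): 1.0}

# Same starting positions (same seed), different data-source weights.
layout_expr_only = spring_embed(
    6, combine_networks([expression, interactions], [1.0, 0.0]))
layout_both = spring_embed(
    6, combine_networks([expression, interactions], [1.0, 1.0]))
```

Raising the weight of the interaction network from 0 to 1 pulls nodes 2 and 3 together, which mimics, in a very reduced form, the kind of dynamic response to re-weighting described above. The 'information landscape' and the Procrustes-based reproducibility check are not covered by this sketch.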