Wong Jason W H, Cartwright Hugh M
Physical and Theoretical Chemistry Laboratory, Department of Chemistry, Oxford University, South Parks Road, Oxford OX1 3QZ, UK.
J Biomed Inform. 2005 Aug;38(4):322-30. doi: 10.1016/j.jbi.2005.02.002. Epub 2005 Mar 5.
Recent advances in clinical proteomics data acquisition have led to the generation of datasets of high complexity and dimensionality. We present here a visualization method for high-dimensionality datasets that makes use of neuronal vectors of a trained growing cell structure (GCS) network for the projection of data points onto two dimensions. The use of a GCS network enables the generation of the projection matrix deterministically rather than randomly as in random projection. Three datasets were used to benchmark the performance and to demonstrate the use of this deterministic projection approach in real-life scientific applications. Comparisons are made to an existing self-organizing map projection method and random projection. The results suggest that deterministic projection outperforms existing methods and is suitable for the visualization of datasets of very high dimensionality.
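The contrast between random and deterministic projection can be illustrated with a minimal sketch. The abstract does not say how neuronal vectors are selected from the trained GCS network or how the projection matrix is assembled, so everything below is an assumption: the function names are hypothetical, the two "neuron" vectors are plain group means standing in for trained GCS weight vectors, and the unit-length normalization is an illustrative choice rather than the authors' procedure.

```python
import numpy as np


def random_projection(X, k=2, seed=None):
    """Project X (n x d) onto k dimensions with a random Gaussian matrix."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
    return X @ R


def deterministic_projection(X, neuron_vectors):
    """Project X (n x d) using fixed vectors (k x d) as the projection axes.

    Assumption: each vector is scaled to unit length so that no axis
    dominates purely because of its magnitude.
    """
    W = np.asarray(neuron_vectors, dtype=float)
    W = W / np.linalg.norm(W, axis=1, keepdims=True)
    return X @ W.T


# Synthetic data: 100 points in 50 dimensions.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))

# Stand-in for two trained GCS neuron vectors (NOT a GCS network):
# the means of the first and second halves of the data.
neurons = np.vstack([X[:50].mean(axis=0), X[50:].mean(axis=0)])

Y_det = deterministic_projection(X, neurons)   # same result on every run
Y_rand_a = random_projection(X, seed=1)        # depends on the seed
Y_rand_b = random_projection(X, seed=2)
```

With the same data and the same vectors, `deterministic_projection` always returns the same 2-D coordinates, whereas `random_projection` gives a different layout for each seed. That repeatability is the property the abstract attributes to the GCS-based approach.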