Agrafiotis Dimitris K
3-Dimensional Pharmaceuticals, Inc., 665 Stockton Drive, Exton, Pennsylvania 19341, USA.
J Comput Chem. 2003 Jul 30;24(10):1215-21. doi: 10.1002/jcc.10234.
We introduce stochastic proximity embedding (SPE), a novel self-organizing algorithm for producing meaningful underlying dimensions from proximity data. SPE attempts to generate low-dimensional Euclidean embeddings that best preserve the similarities between a set of related observations. The method starts with an initial configuration, and iteratively refines it by repeatedly selecting pairs of objects at random, and adjusting their coordinates so that their distances on the map match more closely their respective proximities. The magnitude of these adjustments is controlled by a learning rate parameter, which decreases during the course of the simulation to avoid oscillatory behavior. Unlike classical multidimensional scaling (MDS) and nonlinear mapping (NLM), SPE scales linearly with respect to sample size, and can be applied to very large data sets that are intractable by conventional embedding procedures. The method is programmatically simple, robust, and convergent, and can be applied to a wide range of scientific problems involving exploratory data analysis and visualization.
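The iterative refinement described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so the code below assumes a commonly described form: for a randomly chosen pair, each point moves along the line joining them by half of the learning-rate-scaled difference between the target proximity and the current map distance. The function names `spe` and `stress`, all parameter names and defaults, and the linear learning-rate schedule are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def spe(proximity, dim=2, cycles=50, steps=None,
        lr_start=1.0, lr_end=0.01, eps=1e-10, seed=0):
    """Hypothetical SPE sketch: embed n objects in `dim` dimensions.

    proximity: (n, n) symmetric array of target distances.
    The pairwise update rule and schedule are assumptions (see lead-in).
    """
    rng = np.random.default_rng(seed)
    n = proximity.shape[0]
    if steps is None:
        steps = 10 * n  # updates per cycle grow linearly with n
    coords = rng.random((n, dim))  # random initial configuration

    # The learning rate decreases over the run to damp oscillation.
    for lr in np.linspace(lr_start, lr_end, cycles):
        for _ in range(steps):
            # Draw a pair of distinct indices uniformly at random.
            i = int(rng.integers(n))
            j = int(rng.integers(n - 1))
            if j >= i:
                j += 1
            diff = coords[i] - coords[j]
            d = np.linalg.norm(diff)
            # Move both points so their distance approaches the target.
            shift = lr * 0.5 * (proximity[i, j] - d) / (d + eps) * diff
            coords[i] += shift
            coords[j] -= shift
    return coords


def stress(coords, proximity):
    """Normalized squared error between map distances and proximities."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    iu = np.triu_indices(len(coords), k=1)
    return np.sum((d[iu] - proximity[iu]) ** 2) / np.sum(proximity[iu] ** 2)
```

Each update touches only one pair, so the total work is `cycles * steps` regardless of how many pairwise proximities exist. With `steps` proportional to the number of objects, as assumed here, the cost grows linearly with sample size, in line with the scaling claim in the abstract. The `stress` helper, by contrast, evaluates all pairs and is intended only for checking small examples.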