Castro José, Georgiopoulos Michael, Demara Ronald, Gonzalez Avelino
Department of Electrical and Computer Engineering, University of Central Florida, 4000 Central Florida Blvd. Engineering Building 1, Suite 407, Orlando, FL 32816-2786, USA.
Neural Netw. 2005 Sep;18(7):967-84. doi: 10.1016/j.neunet.2005.01.007.
The Fuzzy ARTMAP algorithm has proven to be one of the premier neural network architectures for classification problems. One of the properties of Fuzzy ARTMAP, which can be both an asset and a liability, is its capacity to produce new nodes (templates) on demand to represent classification categories. This property allows Fuzzy ARTMAP to adapt automatically to the database without having to specify its network size a priori. On the other hand, it has the undesirable side effect that large databases might produce a large network (node proliferation), which can dramatically slow down the training of the algorithm. To address the slow convergence speed of Fuzzy ARTMAP for large database problems, we propose the use of space-filling curves, specifically Hilbert space-filling curves (HSFC). Hilbert space-filling curves allow us to divide the problem into smaller sub-problems, each focusing on a dataset smaller than the original. A different Fuzzy ARTMAP network is used to learn each partition of the data. Through this divide-and-conquer approach we avoid the node proliferation problem, and consequently we speed up Fuzzy ARTMAP's training. Results have been produced for two-class, 16-dimensional Gaussian data and for the Forest database, available at the UCI repository. Our results indicate that the Hilbert space-filling curve approach reduces the time that it takes to train Fuzzy ARTMAP without affecting the generalization performance attained by Fuzzy ARTMAP trained on the original large dataset. Given that the smaller datasets produced by the HSFC approach can be learned independently by different Fuzzy ARTMAP networks, we have also implemented and tested a parallel version of this approach on a Beowulf cluster of workstations, which further speeds up Fuzzy ARTMAP's convergence to a solution for large database problems.
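The partitioning step can be illustrated with a minimal Python sketch. The abstract does not give the authors' procedure, so everything here is an assumption: the inputs are taken to be scaled to [0, 1] in every dimension, the Hilbert index is computed with Skilling's transpose method (a swapped-in, commonly used technique), and the sorted points are cut into contiguous, nearly equal chunks. The names `hilbert_index` and `partition_by_hilbert` and the `bits` parameter are hypothetical.

```python
from typing import List, Sequence


def hilbert_index(coords: Sequence[int], bits: int) -> int:
    """Map integer grid coordinates (each in [0, 2**bits)) to a Hilbert index.

    Uses Skilling's axes-to-transpose method; not necessarily the authors' code.
    """
    x = list(coords)
    n = len(x)
    m = 1 << (bits - 1)

    # Undo the per-level rotations and reflections, from the top bit down.
    q = m
    while q > 1:
        p = q - 1
        for i in range(n):
            if x[i] & q:
                x[0] ^= p                    # invert the low bits of x[0]
            else:
                t = (x[0] ^ x[i]) & p        # exchange low bits of x[0], x[i]
                x[0] ^= t
                x[i] ^= t
        q >>= 1

    # Gray encode.
    for i in range(1, n):
        x[i] ^= x[i - 1]
    t = 0
    q = m
    while q > 1:
        if x[n - 1] & q:
            t ^= q - 1
        q >>= 1
    for i in range(n):
        x[i] ^= t

    # Interleave the bits of the transposed form into a single integer.
    h = 0
    for b in range(bits - 1, -1, -1):
        for i in range(n):
            h = (h << 1) | ((x[i] >> b) & 1)
    return h


def partition_by_hilbert(points: Sequence[Sequence[float]],
                         n_parts: int, bits: int = 8) -> List[List[int]]:
    """Split point indices into n_parts contiguous runs along the curve.

    Assumes every component lies in [0, 1]. Each returned list could be used
    to train a separate classifier (assumed splitting rule).
    """
    top = (1 << bits) - 1

    def key(k: int) -> int:
        cell = [min(top, int(v * (top + 1))) for v in points[k]]
        return hilbert_index(cell, bits)

    order = sorted(range(len(points)), key=key)
    n = len(order)
    return [order[j * n // n_parts:(j + 1) * n // n_parts]
            for j in range(n_parts)]
```

Because consecutive Hilbert indices correspond to adjacent grid cells, each chunk tends to cover a compact region of the input space, and the chunks can be handed to independent workers since they share no state.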