Liu Yang, Jing Weizhe, Xu Lixiong
School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, China.
Comput Intell Neurosci. 2016;2016:2842780. doi: 10.1155/2016/2842780. Epub 2016 Apr 27.
Artificial Neural Network (ANN) is a widely used algorithm in pattern recognition, classification, and prediction. Among the many kinds of neural networks, the backpropagation neural network (BPNN) has become the most prominent owing to its remarkable function-approximation ability. However, a standard BPNN performs a large number of summation and sigmoid computations, which can make it inefficient on large volumes of data. Parallelizing BPNN with distributed computing technologies is therefore an effective way to improve its efficiency. However, traditional parallelization may cause accuracy loss, and although several refinements have been proposed, a satisfactory compromise between efficiency and precision remains difficult to achieve. This paper presents a parallelized BPNN based on the MapReduce computing model, which supplies advanced features including fault tolerance, data replication, and load balancing. To further improve precision, the paper also develops a cascading-model-based classification approach that refines the classification results. The experimental results indicate that the presented parallelized BPNN offers high efficiency while maintaining excellent precision, enabling large-scale machine learning.
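The MapReduce parallelization the abstract describes can be sketched as batch backpropagation whose gradient computation (the "map" step) is partitioned across data shards and whose weight update (the "reduce" step) sums the partial gradients. Because the summed shard gradients equal the full-batch gradient, this particular scheme loses no accuracy relative to serial training. The Python sketch below is illustrative only: the class and function names (`TinyBPNN`, `train_mapreduce_style`) are hypothetical, it runs the mappers serially rather than on an actual Hadoop cluster, and it omits the paper's cascading classification refinement.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyBPNN:
    """Minimal one-hidden-layer BPNN (hypothetical sketch, not the paper's code)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    def forward(self, x):
        # The "sum and sigmoid" operations the abstract refers to.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)))
        return h, y

    def gradients(self, shard):
        """'Map' step: accumulate backprop gradients over one data shard."""
        g1 = [[0.0] * len(self.w1[0]) for _ in self.w1]
        g2 = [0.0] * len(self.w2)
        for x, t in shard:
            h, y = self.forward(x)
            d_out = (y - t) * y * (1.0 - y)          # output-layer error term
            for j, hj in enumerate(h):
                g2[j] += d_out * hj
                d_h = d_out * self.w2[j] * hj * (1.0 - hj)
                for i, xi in enumerate(x):
                    g1[j][i] += d_h * xi
        return g1, g2

def train_mapreduce_style(net, shards, lr=0.5, epochs=500):
    """'Reduce' step: sum per-shard gradients, apply one full-batch update.
    The summed shard gradients equal the serial full-batch gradient."""
    n = sum(len(s) for s in shards)
    for _ in range(epochs):
        grads = [net.gradients(s) for s in shards]  # map phase (parallelizable)
        for j in range(len(net.w2)):                # reduce phase
            net.w2[j] -= lr * sum(g2[j] for _, g2 in grads) / n
            for i in range(len(net.w1[0])):
                net.w1[j][i] -= lr * sum(g1[j][i] for g1, _ in grads) / n

def sq_error(net, data):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data)

# Toy run: a four-sample dataset split across two "mapper" shards.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
net = TinyBPNN(n_in=2, n_hidden=4)
err_before = sq_error(net, data)
train_mapreduce_style(net, [data[:2], data[2:]])
err_after = sq_error(net, data)
```

Because gradient summation is associative, splitting the data over more mappers changes only where the partial sums are computed, not the resulting update; accuracy loss in other parallelization schemes typically comes from training independent sub-networks per shard and merging them afterwards, which is what the paper's cascading approach is designed to compensate for.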