Fang Yuling, Chen Qingkui, Xiong Neal N, Zhao Deyu, Wang Jingjuan
University of Shanghai for Science and Technology, Shanghai 200093, China.
Department of Mathematics and Computer Science, Northeastern State University, Tahlequah, OK 74464, USA.
Sensors (Basel). 2017 Aug 4;17(8):1799. doi: 10.3390/s17081799.
This paper aims to develop a low-cost, high-performance, and high-reliability computing system for processing large-scale data with common data mining algorithms in the Internet of Things (IoT) computing environment. Given the characteristics of IoT data processing, and following mainstream high-performance computing practice, we use a GPU (Graphics Processing Unit) cluster to deliver better IoT services. First, we present an energy consumption calculation method (ECCM) based on wireless sensor networks (WSNs). Then, using the CUDA (Compute Unified Device Architecture) programming model, we propose a Two-level Parallel Optimization Model (TLPOM), which uses careful resource planning and common compiler optimization techniques to find the best block and thread configuration under the resource constraints of each node. The key to this part is dynamically coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve algorithm performance without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system that accounts for node diversity, algorithm characteristics, and other factors. The results show that, with TLPOM, algorithm performance improved on average by 34.1%, 33.96%, and 24.07% on the Fermi, Kepler, and Maxwell architectures, respectively, and that the RGCA ensures our IoT computing system provides low-cost and high-reliability services.
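The trade-off behind choosing a block and thread configuration under per-node resource constraints can be illustrated with a small sketch. This is not the TLPOM itself, whose details the abstract does not give. It is a toy model under stated assumptions: the per-multiprocessor limits in `SmLimits` are illustrative values rather than the specification of any particular GPU, the names `resident_blocks`, `occupancy`, and `best_config` are hypothetical, register use is assumed to grow linearly with the number of independent operations per thread (the ILP factor), register allocation granularity is ignored, and the score (resident warps times ILP factor) is only a crude proxy for latency-hiding capacity.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional, Sequence


@dataclass(frozen=True)
class SmLimits:
    """Per-multiprocessor limits. All values are illustrative assumptions."""
    warp_size: int = 32
    max_threads_per_block: int = 1024
    max_threads_per_sm: int = 2048
    max_blocks_per_sm: int = 16
    registers_per_sm: int = 65536
    shared_mem_per_sm: int = 49152  # bytes


class Config(NamedTuple):
    score: int          # resident warps * ILP factor (toy proxy, an assumption)
    ilp: int            # independent operations per thread
    block_size: int     # threads per block
    blocks_per_sm: int  # blocks resident on one multiprocessor


def resident_blocks(limits: SmLimits, block_size: int,
                    regs_per_thread: int, smem_per_block: int) -> int:
    """Blocks that fit on one multiprocessor at once under every limit."""
    by_threads = limits.max_threads_per_sm // block_size
    by_regs = limits.registers_per_sm // (regs_per_thread * block_size)
    if smem_per_block > 0:
        by_smem = limits.shared_mem_per_sm // smem_per_block
    else:
        by_smem = limits.max_blocks_per_sm  # shared memory is not a constraint
    return min(limits.max_blocks_per_sm, by_threads, by_regs, by_smem)


def occupancy(limits: SmLimits, block_size: int,
              regs_per_thread: int, smem_per_block: int) -> float:
    """Fraction of the multiprocessor's thread slots that are occupied."""
    blocks = resident_blocks(limits, block_size, regs_per_thread, smem_per_block)
    return blocks * block_size / limits.max_threads_per_sm


def best_config(limits: SmLimits, base_regs: int, regs_per_ilp: int,
                smem_per_thread: int = 0,
                ilp_choices: Sequence[int] = (1, 2, 4)) -> Optional[Config]:
    """Search block sizes and ILP factors for the highest toy score.

    More ILP per thread costs more registers (assumed linear), which can
    reduce the number of resident warps, so TLP and ILP pull against
    each other. Ties keep the first candidate found.
    """
    best: Optional[Config] = None
    for ilp in ilp_choices:
        regs = base_regs + regs_per_ilp * ilp
        # Only block sizes that are whole multiples of the warp size.
        for block_size in range(limits.warp_size,
                                limits.max_threads_per_block + 1,
                                limits.warp_size):
            smem = smem_per_thread * block_size
            blocks = resident_blocks(limits, block_size, regs, smem)
            if blocks == 0:
                continue  # this configuration cannot launch at all
            warps = blocks * block_size // limits.warp_size
            candidate = Config(warps * ilp, ilp, block_size, blocks)
            if best is None or candidate.score > best.score:
                best = candidate
    return best
```

With these illustrative limits, `best_config(SmLimits(), base_regs=16, regs_per_ilp=8)` picks 96 threads per block with four independent operations per thread. Occupancy is lower than at an ILP factor of 1, but the toy score is higher, which is the kind of TLP/ILP coupling the abstract describes. A real configuration would need measured register and shared-memory usage for each kernel and each GPU generation in the cluster.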