School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China.
Jiangsu Vocational College of Finance and Economics, Huai'an, Jiangsu 223003, China.
Comput Intell Neurosci. 2022 Oct 7;2022:1545024. doi: 10.1155/2022/1545024. eCollection 2022.
Job scheduling should balance cluster load in advance rather than remedy it once the load has become seriously unbalanced. This paper therefore analyzes the task scheduling flow of the Hadoop cluster in depth. A self-dividing algorithm for load balancing is proposed on the Hadoop platform, and an intelligent optimization algorithm is used to solve the load balancing problem. From the point of view of task scheduling, a dynamic feedback load balancing scheduling method is proposed. To address the shortcomings of the fair scheduling algorithm, the paper proposes two ways to improve the resource utilization and overall performance of Hadoop. When the map tasks are completed and the reduce tasks are assigned, the assignment is based on the performance of the nodes that will run the reduce tasks. The fusion algorithm combines the advantages of the ant colony algorithm and the bee colony algorithm so that it can better deal with load balancing. Three existing scheduling algorithms are then introduced in detail: single queue scheduling, capacity scheduling, and fair scheduling. On this basis, an improved task scheduling strategy based on a genetic algorithm is proposed to allocate and execute application tasks so as to reduce task completion time. Experiments verify the effectiveness of the algorithm. The LBNP algorithm greatly improves the efficiency of reduce task execution and job execution. The delay capacity scheduling algorithm ensures that most tasks achieve localized scheduling, improves resource utilization and load balance, and shortens job completion time.
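The abstract states that reduce tasks are assigned according to the performance of the nodes that will run them, but it does not give the rule itself. As a minimal sketch of one possible reading, the snippet below splits a number of reduce tasks across nodes in proportion to a performance score, using the largest-remainder method for rounding. The function name `assign_reduce_tasks`, the `node_scores` mapping, and the proportional rule are all assumptions made for illustration; this is not the paper's LBNP algorithm and does not use any Hadoop API.

```python
from typing import Dict


def assign_reduce_tasks(node_scores: Dict[str, float], num_tasks: int) -> Dict[str, int]:
    """Split num_tasks across nodes in proportion to their (hypothetical) performance scores."""
    if num_tasks < 0:
        raise ValueError("num_tasks must be non-negative")
    if not node_scores or any(score <= 0 for score in node_scores.values()):
        raise ValueError("need at least one node, and every score must be positive")

    total = sum(node_scores.values())
    # Ideal fractional share of the tasks for each node.
    ideal = {node: num_tasks * score / total for node, score in node_scores.items()}
    # Start from the integer part of each share.
    counts = {node: int(ideal[node]) for node in node_scores}
    leftover = max(num_tasks - sum(counts.values()), 0)
    # Give the remaining tasks to the nodes with the largest fractional remainders.
    by_remainder = sorted(node_scores, key=lambda n: ideal[n] - counts[n], reverse=True)
    for node in by_remainder[:leftover]:
        counts[node] += 1
    return counts


if __name__ == "__main__":
    # Illustrative scores only; a real cluster would derive them from measured node load.
    print(assign_reduce_tasks({"node-a": 2.0, "node-b": 1.0, "node-c": 1.0}, 8))
```

In this sketch a node with twice the score receives roughly twice as many reduce tasks. A dynamic feedback variant, as the abstract describes, would presumably recompute the scores from load reports before each assignment round.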