STDADS: An Efficient Slow Task Detection Algorithm for Deadline Schedulers.

Affiliation

National Institute of Technology, Jalandhar, India.

Publication information

Big Data. 2020 Feb;8(1):62-69. doi: 10.1089/big.2019.0039. Epub 2020 Jan 29.

Abstract

The MapReduce programming model was designed and developed for the Google File System to efficiently process large-scale distributed data sets. The open-source implementation of this Google project is Apache Hadoop. The Hadoop architecture includes Hadoop MapReduce and the Hadoop Distributed File System (HDFS). HDFS supports Hadoop in effectively managing data sets across the cluster, and the MapReduce programming paradigm helps in the efficient processing of large data sets. MapReduce strategically re-executes a speculative task on another node to finish the computation quickly, enhancing the overall Quality of Service (QoS). Several mechanisms have been suggested on top of Hadoop's Default Scheduler to improve speculative task execution on Hadoop clusters. A large number of strategies have also been suggested for scheduling jobs with deadlines. However, the mechanisms for speculative task execution were not developed for, or were not well integrated with, Deadline Schedulers. This article presents an improved speculative task detection algorithm designed specifically for the Deadline Scheduler. Our studies suggest the importance of regularly tracking each node's performance in order to re-execute speculative tasks more efficiently. We have improved the QoS offered by Hadoop clusters for jobs arriving with deadlines in terms of the percentage of successfully completed jobs, the detection time of speculative tasks, the accuracy of speculative task detection, and the percentage of incorrectly flagged speculative tasks.
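The abstract does not spell out the STDADS algorithm itself, so the following is only a minimal Python sketch of the general idea it describes: flag a running task as slow when its estimated remaining time would miss the job deadline, and keep a running per-node performance score so that a backup copy can be placed on a faster node. All names here (`Task`, `estimated_time_left`, `find_slow_tasks`, `update_node_score`, `pick_backup_node`) and the smoothing factor `alpha` are hypothetical and are not taken from the paper or from Hadoop.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    node: str
    progress: float  # fraction complete, 0.0 to 1.0
    elapsed: float   # seconds since the task started


def estimated_time_left(task: Task) -> float:
    """Remaining seconds, assuming the task keeps its average progress rate."""
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return float("inf")  # no progress information yet
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def find_slow_tasks(tasks, time_to_deadline):
    """IDs of tasks whose estimated finish would miss the deadline (assumed rule)."""
    return [t.task_id for t in tasks if estimated_time_left(t) > time_to_deadline]


def update_node_score(scores, node, observed_rate, alpha=0.5):
    """Track node performance as an exponentially weighted moving average."""
    previous = scores.get(node, observed_rate)
    scores[node] = alpha * observed_rate + (1 - alpha) * previous
    return scores[node]


def pick_backup_node(scores, exclude):
    """Choose the best-scoring node other than the one running the slow task."""
    candidates = {n: s for n, s in scores.items() if n != exclude}
    return max(candidates, key=candidates.get) if candidates else None
```

In this sketch, a scheduler loop would call `update_node_score` on each progress report, then `find_slow_tasks` with the time remaining until the job deadline, and launch a backup for each flagged task on the node returned by `pick_backup_node`. The deadline-based threshold is an illustrative choice, not the detection rule evaluated in the article.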
