STDADS: An Efficient Slow Task Detection Algorithm for Deadline Schedulers.

Affiliation

National Institute of Technology, Jalandhar, India.

Publication information

Big Data. 2020 Feb;8(1):62-69. doi: 10.1089/big.2019.0039. Epub 2020 Jan 29.

DOI: 10.1089/big.2019.0039
PMID: 31995397

Abstract

The MapReduce programming model was designed and developed for the Google File System to efficiently process large-scale distributed data sets. The open-source implementation of this Google project is Apache Hadoop. The Hadoop architecture includes Hadoop MapReduce and the Hadoop Distributed File System (HDFS). HDFS helps Hadoop manage data sets effectively across the cluster, and the MapReduce programming paradigm helps process large data sets efficiently. MapReduce strategically re-executes a speculative task on another node to finish the computation quickly, enhancing the overall Quality of Service (QoS). Several mechanisms have been suggested over Hadoop's Default Scheduler to improve speculative task execution on Hadoop clusters. A large number of strategies have also been suggested for scheduling jobs with deadlines. However, the mechanisms for speculative task execution were either not developed for Deadline Schedulers or not well integrated with them. This article presents an improved speculative task detection algorithm designed specifically for the Deadline Scheduler. Our studies suggest the importance of regularly tracking each node's performance to re-execute speculative tasks more efficiently. We have improved the QoS offered by Hadoop clusters for jobs arriving with deadlines in terms of the percentage of successfully completed jobs, the detection time of speculative tasks, the accuracy of speculative task detection, and the percentage of incorrectly flagged speculative tasks.
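
The abstract does not spell out the STDADS algorithm, so the following is only a minimal sketch of a generic deadline-aware slow-task check, not the authors' method. It assumes each task reports a progress fraction and an elapsed time, flags tasks whose extrapolated remaining time exceeds the time left before the job deadline, and keeps a smoothed per-node progress rate to choose a node for the backup attempt. All names (`TaskStatus`, `detect_slow_tasks`, `update_node_rate`, `pick_backup_node`) and the parameters `min_elapsed` and `alpha` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskStatus:
    task_id: str
    node: str
    progress: float  # fraction of work completed, 0.0 to 1.0
    elapsed: float   # seconds since the task attempt started


def estimated_time_left(task):
    """Extrapolate remaining seconds from the observed progress rate."""
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return float("inf")  # no usable rate yet, so no finite estimate
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def detect_slow_tasks(tasks, time_to_deadline, min_elapsed=10.0):
    """Return ids of tasks unlikely to finish before the deadline, slowest first.

    Tasks younger than min_elapsed seconds are skipped (assumption: too
    little evidence to judge them).
    """
    flagged = []
    for task in tasks:
        if task.elapsed < min_elapsed:
            continue
        remaining = estimated_time_left(task)
        if remaining > time_to_deadline:
            flagged.append((remaining, task.task_id))
    flagged.sort(reverse=True)
    return [task_id for _, task_id in flagged]


def update_node_rate(node_rates, node, observed_rate, alpha=0.5):
    """Track node performance as an exponential moving average of progress rate."""
    previous = node_rates.get(node)
    if previous is None:
        node_rates[node] = observed_rate
    else:
        node_rates[node] = alpha * observed_rate + (1.0 - alpha) * previous
    return node_rates[node]


def pick_backup_node(node_rates, exclude):
    """Choose the fastest tracked node other than the one running the slow task."""
    candidates = {n: r for n, r in node_rates.items() if n != exclude}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

With 60 seconds left before the deadline, a task at 20% progress after 80 seconds would be flagged (about 320 seconds remaining), while a task at 80% after the same time would not (about 20 seconds remaining). The moving average is one simple way to keep a regular track of node performance; a real scheduler would likely also weigh data locality and free slots.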

Similar articles

1. STDADS: An Efficient Slow Task Detection Algorithm for Deadline Schedulers.
   Big Data. 2020 Feb;8(1):62-69. doi: 10.1089/big.2019.0039. Epub 2020 Jan 29.
2. How Heterogeneity Affects the Design of Hadoop MapReduce Schedulers: A State-of-the-Art Survey and Challenges.
   Big Data. 2018 Jun;6(2):72-95. doi: 10.1089/big.2018.0013.
3. MRS-DP: Improving Performance and Resource Utilization of Big Data Applications with Deadlines and Priorities.
   Big Data. 2020 Aug;8(4):323-331. doi: 10.1089/big.2020.0081.
4. Applications of the MapReduce programming framework to clinical big data analysis: current landscape and future trends.
   BioData Min. 2014 Oct 29;7:22. doi: 10.1186/1756-0381-7-22. eCollection 2014.
5. CloudDOE: a user-friendly tool for deploying Hadoop clouds and analyzing high-throughput sequencing data with MapReduce.
   PLoS One. 2014 Jun 4;9(6):e98146. doi: 10.1371/journal.pone.0098146. eCollection 2014.
6. Using Hadoop MapReduce for Parallel Genetic Algorithms: A Comparison of the Global, Grid and Island Models.
   Evol Comput. 2018 Winter;26(4):535-567. doi: 10.1162/evco_a_00213. Epub 2017 Jun 29.
7. Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment.
   Sensors (Basel). 2016 Aug 30;16(9):1386. doi: 10.3390/s16091386.
8. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics.
   BMC Bioinformatics. 2010 Dec 21;11 Suppl 12(Suppl 12):S1. doi: 10.1186/1471-2105-11-S12-S1.
9. Medical Cloud Computing Data Processing to Optimize the Effect of Drugs.
   J Healthc Eng. 2021 Mar 19;2021:5560691. doi: 10.1155/2021/5560691. eCollection 2021.
10. MRPack: Multi-Algorithm Execution Using Compute-Intensive Approach in MapReduce.
    PLoS One. 2015 Aug 25;10(8):e0136259. doi: 10.1371/journal.pone.0136259. eCollection 2015.