

MonkeyKing: Adaptive Parameter Tuning on Big Data Platforms with Deep Reinforcement Learning.

Author affiliations

College of Electronics and Information Engineering, Tongji University, Shanghai, China.

School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai, China.

Publication information

Big Data. 2020 Aug;8(4):270-290. doi: 10.1089/big.2019.0123. Epub 2020 Jul 10.

Abstract

Choosing the right parameter configuration for recurring jobs running on big data analytics platforms is difficult because there can be hundreds of possible configurations to pick from. The appropriate choice also depends on the type of application and on user requirements. The performance gap between the best and the worst configuration can exceed a factor of 10. Moreover, the parameters of big data platforms are not independent, which makes it challenging to automatically identify the optimal configuration for a broad spectrum of applications. To alleviate these problems, we propose MonkeyKing, a system that leverages past experience and collects new information to adjust the parameter configurations of big data platforms. It recommends key parameters, those with a strong impact on performance for a given job type, and then applies deep reinforcement learning (DRL) to optimize them and improve job performance. We evaluate four algorithms, the popular deep Q-network (DQN) and three improved variants (Double DQN, Dueling DQN, and the combination of Double DQN and Dueling DQN), and find that the combined Double DQN and Dueling DQN performs best. Our experiments and evaluations on Spark show that performance can be improved by ∼25% under the best conditions.
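For readers unfamiliar with the two DQN refinements named in the abstract, the following is a minimal sketch of their standard textbook forms, not the authors' implementation. Dueling DQN splits the Q-value into a state value and per-action advantages, and Double DQN picks the next action with the online network but scores it with the target network. The function names, the three-action setup, and all numeric values are illustrative assumptions; the abstract does not specify MonkeyKing's state, action, or reward design.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    done: bool,
) -> float:
    """Double DQN target: select the action with the online network,
    evaluate that action with the target network."""
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical example with three actions (e.g. lower, keep, or raise one parameter).
q_online_next = dueling_q_values(1.0, [1.0, 2.0, 3.0])   # online network's estimate
q_target_next = [4.0, 6.0, 2.0]                           # target network's estimate
target = double_dqn_target(1.0, 0.5, q_online_next, q_target_next, done=False)
```

In this example the online network prefers the third action, so the target bootstraps from the target network's value for that action rather than from the target network's own maximum. That decoupling is what reduces the overestimation bias of plain DQN.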

