
Diversity-driven knowledge distillation for financial trading using Deep Reinforcement Learning.

Affiliation

School of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece.

Publication

Neural Netw. 2021 Aug;140:193-202. doi: 10.1016/j.neunet.2021.02.026. Epub 2021 Mar 17.

Abstract

Deep Reinforcement Learning (DRL) is increasingly used to develop financial trading agents for a wide range of tasks. However, optimizing DRL agents is notoriously difficult and unstable, especially in noisy financial environments, which significantly hinders the performance of trading agents. In this work, we present a novel method that improves the training reliability of DRL trading agents by building upon the well-known approach of neural network distillation. In the proposed approach, teacher agents are trained on different subsets of the RL environment, thus diversifying the policies they learn. Student agents are then trained using distillation from the trained teachers to guide the training process, allowing for better exploration of the solution space while "mimicking" an existing policy/trading strategy provided by the teacher models. The boost in effectiveness of the proposed method comes from the use of diversified ensembles of teachers trained to perform trading for different currencies. This enables us to transfer the teachers' common view of the most profitable policy to the student, further improving training stability in noisy financial environments. In the conducted experiments we find that, when applying distillation, constraining the teacher models to be diversified can significantly improve the performance of the final student agents. We demonstrate this through an extensive evaluation on various financial trading tasks. Furthermore, we provide additional experiments in the separate domain of game control using the Procgen environments to demonstrate the generality of the proposed method.
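To make the core idea concrete, the following is a minimal sketch of what the distillation term could look like: the student's policy is pulled towards the averaged action distribution of a diversified teacher ensemble, and this term is added to the ordinary RL policy loss. It assumes a discrete action space and PyTorch; the function name `distillation_loss` and the hyperparameters `alpha` and `temperature` are illustrative assumptions, not the paper's actual code.

```python
# Sketch only: assumes discrete actions and teacher agents already trained on
# different currencies / environment subsets (the "diversified ensemble").
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits_list, alpha=1.0, temperature=2.0):
    """KL divergence between the student policy and the averaged policy
    of a diversified teacher ensemble.

    student_logits:      (batch, n_actions) logits from the student policy network
    teacher_logits_list: list of (batch, n_actions) logits, one tensor per teacher
    """
    # Average the teachers' action distributions to obtain their "common view"
    # of the most profitable policy.
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)

    # Student log-probabilities at the same temperature.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)

    # KL(teacher || student), averaged over the batch; weighted by alpha.
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    return alpha * kl


# Usage (hypothetical): add the distillation term to the usual RL objective.
# total_loss = rl_policy_loss + distillation_loss(student_logits, teacher_logits_list)
```

In this sketch the student still optimizes its own RL objective; the distillation term only regularizes it towards the ensemble consensus, which is the mechanism the abstract credits for the improved training stability.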

