Qiu Yifu, Qiu Yitao, Yuan Yicong, Chen Zheng, Lee Raymond
Department of Computer Science and Technology, Division of Science and Technology, BNU-HKBU United International College, Zhuhai, China.
Front Artif Intell. 2021 Oct 29;4:749878. doi: 10.3389/frai.2021.749878. eCollection 2021.
Reinforcement Learning (RL)-based machine trading has attracted substantial interest. However, in existing research, RL applied to the day-trading task suffers from noisy financial movements on short time scales, difficulty in order settlement, and expensive action search in a continuous-value space. This paper introduces QF-TraderNet, an end-to-end RL intraday trading agent based on quantum finance theory (QFT) and deep reinforcement learning. We propose a novel design for the intraday RL trader's action space, inspired by Quantum Price Levels (QPLs). This action space design also gives the model a learnable profit-and-loss control strategy. QF-TraderNet comprises two neural networks: 1) a long short-term memory (LSTM) network for feature learning on financial time series; 2) a policy generator network (PGN) for generating the distribution over actions. The profitability and robustness of QF-TraderNet have been verified on multiple types of financial datasets, including FOREX, metals, crude oil, and financial indices. The experimental results demonstrate that QF-TraderNet outperforms other baselines in terms of cumulative price return and Sharpe ratio, and in robustness under accidental market shifts.
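To make the action-space idea concrete, here is a minimal, hypothetical sketch (names, dimensions, and the number of QPLs are illustrative assumptions, not details from the paper): instead of searching a continuous price space, the agent picks from a discrete set of actions anchored at K Quantum Price Levels above and below the current price, with a softmax policy head mapping learned time-series features to a distribution over those actions.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5                      # QPLs above/below current price (assumed)
n_actions = 2 * K + 1      # open-long/short at each QPL, plus hold

def softmax(z):
    """Numerically stable softmax over action logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def policy(features, W, b):
    """Map LSTM-style feature vector to a distribution over QPL-anchored actions."""
    return softmax(W @ features + b)

d = 8                                  # feature dimension (assumed)
W = rng.normal(size=(n_actions, d))    # stand-in for learned PGN weights
b = np.zeros(n_actions)
x = rng.normal(size=d)                 # stand-in for LSTM-learned features

probs = policy(x, W, b)                # action distribution, sums to 1
action = int(np.argmax(probs))         # greedy action at evaluation time
```

In training, the action would instead be sampled from `probs` and the policy updated with a policy-gradient method; because each non-hold action is tied to a specific QPL, choosing an action implicitly sets the entry/exit level, which is one way the learnable profit-and-loss control described above could be realized.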