From deterministic to stochastic: an interpretable stochastic model-free reinforcement learning framework for portfolio optimization.

Authors

Song Zitao, Wang Yining, Qian Pin, Song Sifan, Coenen Frans, Jiang Zhengyong, Su Jionglong

Affiliations

Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China.

Department of Computer Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China.

Publication information

Appl Intell (Dordr). 2023;53(12):15188-15203. doi: 10.1007/s10489-022-04217-5. Epub 2022 Nov 11.

Abstract

Portfolio optimization, a fundamental problem in algorithmic trading, aims to maximize cumulative return by continuously investing in various financial derivatives over a given time period. In recent years, trading algorithms have shifted from traditional machine learning to reinforcement learning, which is better suited to sequential decision making. However, the exponential growth of imperfect and noisy financial data, which deterministic reinforcement learning strategies are supposed to exploit, makes it increasingly difficult to obtain a consistently profitable portfolio. In this work, we first reconstruct several deterministic and stochastic reinforcement learning algorithms as benchmarks. On this basis, we introduce a risk-aware reward function to balance risk and return. Most importantly, we propose a novel interpretable stochastic reinforcement learning framework that tailors a stochastic policy parameterized by Gaussian mixtures and a distributional critic realized by quantiles to the problem of portfolio optimization. In our experiments, the proposed algorithm achieves a 63.1% annual rate of return on U.S. market stocks while reducing the maximum drawdown in market value by 10% when back-tested over the stock market crash around March 2020.
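
To make the abstract's terms concrete, the following minimal Python sketch shows one way a Gaussian-mixture stochastic policy could emit portfolio weights, together with the standard definitions of maximum drawdown and annualized return used to report results. It is an illustration under stated assumptions, not the authors' implementation: the function names, the diagonal-covariance mixture, the softmax mapping to long-only weights, and the 252-trading-day convention are all hypothetical choices that the abstract does not specify.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax; the output is positive and sums to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sample_gmm_weights(mix_logits, means, stds, rng):
    """Sample portfolio weights from a diagonal Gaussian-mixture policy.

    Assumption (not from the paper): a component is picked by its mixing
    probability, a raw action is drawn from that component, and a softmax
    maps the raw action to long-only weights that sum to 1.
    """
    probs = softmax(mix_logits)
    k = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    raw = [rng.gauss(mu, sd) for mu, sd in zip(means[k], stds[k])]
    return softmax(raw)


def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


def annualized_return(values, periods_per_year=252):
    """Geometric annualized return of a portfolio value series."""
    n_periods = len(values) - 1
    return (values[-1] / values[0]) ** (periods_per_year / n_periods) - 1.0


rng = random.Random(0)
weights = sample_gmm_weights(
    mix_logits=[0.0, 1.0],                       # two mixture components
    means=[[0.0, 0.0, 0.0], [1.0, 0.0, -1.0]],   # three assets
    stds=[[0.1, 0.1, 0.1], [0.2, 0.2, 0.2]],
    rng=rng,
)
```

Here `weights` is a list of three positive numbers summing to 1, and `max_drawdown([100, 120, 90, 130])` evaluates the drop from the peak of 120 to the trough of 90. A stochastic policy of this kind samples a different allocation on each call, in contrast to a deterministic policy that always returns the same weights for the same state.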

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/9651127/7de46c4fd626/10489_2022_4217_Fig1_HTML.jpg
