From deterministic to stochastic: an interpretable stochastic model-free reinforcement learning framework for portfolio optimization.

Authors

Song Zitao, Wang Yining, Qian Pin, Song Sifan, Coenen Frans, Jiang Zhengyong, Su Jionglong

Affiliations

Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China.

Department of Computer Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China.

Publication information

Appl Intell (Dordr). 2023;53(12):15188-15203. doi: 10.1007/s10489-022-04217-5. Epub 2022 Nov 11.

DOI:10.1007/s10489-022-04217-5

PMID:36405345

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9651127/

Abstract

As a fundamental problem in algorithmic trading, portfolio optimization aims to maximize the cumulative return by continuously investing in various financial derivatives within a given time period. Recent years have witnessed the transformation from traditional machine learning trading algorithms to reinforcement learning algorithms due to their superior nature of sequential decision making. However, the exponential growth of the imperfect and noisy financial data that is supposedly leveraged by the deterministic strategy in reinforcement learning, makes it increasingly challenging for one to continuously obtain a profitable portfolio. Thus, in this work, we first reconstruct several deterministic and stochastic reinforcement algorithms as benchmarks. On this basis, we introduce a risk-aware reward function to balance the risk and return. Importantly, we propose a novel interpretable stochastic reinforcement learning framework which tailors a stochastic policy parameterized by Gaussian Mixtures and a distributional critic realized by quantiles for the problem of portfolio optimization. In our experiment, the proposed algorithm demonstrates its superior performance on U.S. market stocks with a 63.1% annual rate of return while at the same time reducing the market value max drawdown by 10% when back-testing during the stock market crash around March 2020.
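The two headline figures in the abstract, the annual rate of return and the maximum drawdown, are standard back-testing metrics. The following is a minimal sketch of how such metrics are commonly computed from a series of portfolio values. It is not the authors' evaluation code: the function names, the `periods_per_year` parameter, and the sample values are illustrative assumptions.

```python
import numpy as np

def max_drawdown(values):
    """Largest peak-to-trough decline of a value series, as a fraction of the peak."""
    values = np.asarray(values, dtype=float)
    # Highest value seen so far at each time step.
    running_peak = np.maximum.accumulate(values)
    drawdowns = (running_peak - values) / running_peak
    return float(drawdowns.max())

def annualized_return(values, periods_per_year=252):
    """Compound growth rate scaled to one year (252 trading days is an assumed default)."""
    values = np.asarray(values, dtype=float)
    n_periods = len(values) - 1
    total_growth = values[-1] / values[0]
    return float(total_growth ** (periods_per_year / n_periods) - 1.0)

# Hypothetical portfolio values, for illustration only (not data from the paper).
portfolio_values = [100.0, 110.0, 99.0, 104.0, 121.0]
mdd = max_drawdown(portfolio_values)
ann = annualized_return(portfolio_values, periods_per_year=4)
print(f"max drawdown: {mdd:.3f}, annualized return: {ann:.3f}")
```

In this toy series the value peaks at 110 and then falls to 99, so the maximum drawdown is 10% of the peak. With four periods treated as one year, the annualized return equals the total growth of 21%.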

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/9651127/7de46c4fd626/10489_2022_4217_Fig1_HTML.jpg

Similar articles

From deterministic to stochastic: an interpretable stochastic model-free reinforcement learning framework for portfolio optimization.

Appl Intell (Dordr). 2023;53(12):15188-15203. doi: 10.1007/s10489-022-04217-5. Epub 2022 Nov 11.

Dynamic stock-decision ensemble strategy based on deep reinforcement learning.

Appl Intell (Dordr). 2023;53(2):2452-2470. doi: 10.1007/s10489-022-03606-0. Epub 2022 May 9.

Risk-aware multi-armed bandit problem with application to portfolio selection.

R Soc Open Sci. 2017 Nov 15;4(11):171377. doi: 10.1098/rsos.171377. eCollection 2017 Nov.

Curriculum learning empowered reinforcement learning for graph-based portfolio management: Performance optimization and comprehensive analysis.

Neural Netw. 2024 Nov;179:106537. doi: 10.1016/j.neunet.2024.106537. Epub 2024 Jul 14.

Model-based reinforcement learning with non-Gaussian environment dynamics and its application to portfolio optimization.

Chaos. 2023 Aug 1;33(8). doi: 10.1063/5.0155574.

Multi-agent reinforcement learning approach for hedging portfolio problem.

Soft Comput. 2021;25(12):7877-7885. doi: 10.1007/s00500-021-05801-6. Epub 2021 Apr 19.

Management of investment portfolios employing reinforcement learning.

PeerJ Comput Sci. 2023 Dec 11;9:e1695. doi: 10.7717/peerj-cs.1695. eCollection 2023.

Optimization of investment strategies through machine learning.

Heliyon. 2023 May 11;9(5):e16155. doi: 10.1016/j.heliyon.2023.e16155. eCollection 2023 May.

MSPM: A modularized and scalable multi-agent reinforcement learning-based system for financial portfolio management.

PLoS One. 2022 Feb 18;17(2):e0263689. doi: 10.1371/journal.pone.0263689. eCollection 2022.

Stock market optimization amidst the COVID-19 pandemic: Technical analysis, K-means algorithm, and mean-variance model (TAKMV) approach.

Heliyon. 2023 Jul;9(7):e17577. doi: 10.1016/j.heliyon.2023.e17577. Epub 2023 Jun 22.

Cited by

Management of investment portfolios employing reinforcement learning.

PeerJ Comput Sci. 2023 Dec 11;9:e1695. doi: 10.7717/peerj-cs.1695. eCollection 2023.

A Multiscale Recursive Attention Gate Federation Method for Multiple Working Conditions Fault Diagnosis.

Entropy (Basel). 2023 Aug 4;25(8):1165. doi: 10.3390/e25081165.
