Deep Reinforcement Learning for Multiobjective Optimization.

Authors

Li Kaiwen, Zhang Tao, Wang Rui

Publication information

IEEE Trans Cybern. 2021 Jun;51(6):3103-3114. doi: 10.1109/TCYB.2020.2977661. Epub 2021 May 18.

DOI:10.1109/TCYB.2020.2977661

PMID:32191907

Abstract

This article proposes an end-to-end framework for solving multiobjective optimization problems (MOPs) using deep reinforcement learning (DRL), which we call the DRL-based multiobjective optimization algorithm (DRL-MOA). The idea of decomposition is adopted to decompose the MOP into a set of scalar optimization subproblems. Each subproblem is then modeled as a neural network. The model parameters of all the subproblems are optimized collaboratively according to a neighborhood-based parameter-transfer strategy and the DRL training algorithm. Pareto-optimal solutions can be obtained directly from the trained neural-network models. Specifically, the multiobjective traveling salesman problem (MOTSP) is solved in this article using the DRL-MOA method by modeling each subproblem as a Pointer Network. Extensive experiments have been conducted to study the DRL-MOA, and it is compared with various benchmark methods. It is found that once the trained model is available, it can scale to newly encountered problems with no need for retraining. The solutions can be obtained directly by a simple forward calculation of the neural network; thus, no iteration is required and the MOP can always be solved in a reasonable time. The proposed method provides a new way of solving the MOP by means of DRL. It exhibits a set of new characteristics, for example, strong generalization ability and fast solving speed, in comparison with existing methods for multiobjective optimization. The experimental results show the effectiveness and competitiveness of the proposed method in terms of model performance and running time.
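
The decomposition and parameter-transfer ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the scalarization, so a weighted sum is assumed here; the bi-objective TSP is assumed to use two coordinate sets, one per objective; and `weight_vectors`, `tour_costs`, `scalarize`, `train_subproblems`, and the `train_one` callback are hypothetical names introduced only for illustration.

```python
import math


def weight_vectors(n_subproblems):
    """Evenly spread bi-objective weight vectors (w1, w2) with w1 + w2 = 1."""
    if n_subproblems < 2:
        raise ValueError("need at least two subproblems")
    step = 1.0 / (n_subproblems - 1)
    return [(1.0 - i * step, i * step) for i in range(n_subproblems)]


def tour_costs(tour, coords_a, coords_b):
    """Return the two objective values of a closed tour.

    Assumption: each objective is the Euclidean tour length measured
    in its own coordinate set (coords_a for objective 1, coords_b for 2).
    """
    def length(coords):
        n = len(tour)
        return sum(
            math.dist(coords[tour[i]], coords[tour[(i + 1) % n]])
            for i in range(n)
        )
    return length(coords_a), length(coords_b)


def scalarize(costs, weights):
    """Weighted-sum scalarization (assumed) of a vector of objective values."""
    return sum(w * c for w, c in zip(weights, costs))


def train_subproblems(weights_list, init_params, train_one):
    """Train one model per subproblem, warm-starting from the neighbor.

    `train_one(params, weights)` is a hypothetical callback that trains a
    model for one scalar subproblem and returns its parameters. Each
    subproblem starts from the parameters of the previously trained
    (neighboring) subproblem, which is one simple reading of
    "neighborhood-based parameter transfer".
    """
    params = init_params
    trained = []
    for weights in weights_list:
        params = train_one(params, weights)
        trained.append(params)
    return trained
```

In such a setup, each trained model would produce one tour per weight vector, and the resulting set of objective pairs would approximate the Pareto front.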

Similar articles

A deep reinforcement learning algorithm framework for solving multi-objective traveling salesman problem based on feature transformation.

Neural Netw. 2024 Aug;176:106359. doi: 10.1016/j.neunet.2024.106359. Epub 2024 May 3.

Meta-Learning-Based Deep Reinforcement Learning for Multiobjective Optimization Problems.

IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7978-7991. doi: 10.1109/TNNLS.2022.3148435. Epub 2023 Oct 6.

Multiobjective Combinatorial Optimization Using a Single Deep Reinforcement Learning Model.

IEEE Trans Cybern. 2024 Mar;54(3):1984-1996. doi: 10.1109/TCYB.2023.3312476. Epub 2024 Feb 9.

An Improved Multiobjective Optimization Evolutionary Algorithm Based on Decomposition for Complex Pareto Fronts.

IEEE Trans Cybern. 2016 Feb;46(2):421-37. doi: 10.1109/TCYB.2015.2403131. Epub 2015 Mar 13.

The Collaborative Local Search Based on Dynamic-Constrained Decomposition With Grids for Combinatorial Multiobjective Optimization.

IEEE Trans Cybern. 2021 May;51(5):2639-2650. doi: 10.1109/TCYB.2019.2931434. Epub 2021 Apr 15.

Interrelationship-Based Selection for Decomposition Multiobjective Optimization.

IEEE Trans Cybern. 2015 Oct;45(10):2076-88. doi: 10.1109/TCYB.2014.2365354. Epub 2014 Dec 4.

Biased Multiobjective Optimization and Decomposition Algorithm.

IEEE Trans Cybern. 2017 Jan;47(1):52-66. doi: 10.1109/TCYB.2015.2507366. Epub 2016 Feb 3.

Hybridization of decomposition and local search for multiobjective optimization.

IEEE Trans Cybern. 2014 Oct;44(10):1808-20. doi: 10.1109/TCYB.2013.2295886.

Set-Based Discrete Particle Swarm Optimization Based on Decomposition for Permutation-Based Multiobjective Combinatorial Optimization Problems.

IEEE Trans Cybern. 2018 Jul;48(7):2139-2153. doi: 10.1109/TCYB.2017.2728120. Epub 2017 Aug 7.

Cited by

Mission Sequence Model and Deep Reinforcement Learning-Based Replanning Method for Multi-Satellite Observation.

Sensors (Basel). 2025 Mar 10;25(6):1707. doi: 10.3390/s25061707.

A Space Telescope Scheduling Approach Combining Observation Priority Coding with Problem Decomposition Strategies.

Biomimetics (Basel). 2024 Nov 21;9(12):718. doi: 10.3390/biomimetics9120718.

MTMol-GPT: De novo multi-target molecular generation with transformer-based generative adversarial imitation learning.

PLoS Comput Biol. 2024 Jun 26;20(6):e1012229. doi: 10.1371/journal.pcbi.1012229. eCollection 2024 Jun.

Analysis of public opinion evolution of COVID-19 based on LDA-ARMA hybrid model.

Complex Intell Systems. 2021;7(6):3165-3178. doi: 10.1007/s40747-021-00514-7. Epub 2021 Sep 4.
