Multiobjective Combinatorial Optimization Using a Single Deep Reinforcement Learning Model.

Author information

Wang Zhenkun, Yao Shunyu, Li Genghui, Zhang Qingfu

Publication information

IEEE Trans Cybern. 2024 Mar;54(3):1984-1996. doi: 10.1109/TCYB.2023.3312476. Epub 2024 Feb 9.

Abstract

This article proposes using a single deep reinforcement learning model to solve combinatorial multiobjective optimization problems. We use the well-known multiobjective traveling salesman problem (MOTSP) as an example. Our proposed method employs an encoder-decoder framework to learn the mapping from an MOTSP instance to its Pareto-optimal set. Specifically, it leverages a novel routing encoder to extract information about both the multiobjective problem as a whole and each individual objective from the MOTSP instance. The global embeddings and each objective's embeddings are adaptively aggregated via a routing network to form subproblem embeddings that represent the MOTSP features well. Using a modified context embedding, the subproblem embeddings are fed into a decoder to produce a set of approximate Pareto-optimal solutions in parallel. Additionally, we develop a Top-k baseline to enable more efficient data utilization and lightweight training for our proposed method. We compare our method with heuristic-based and learning-based ones on various types of MOTSP instances, and the experimental results show that our method can solve MOTSP instances in real time and outperform the other algorithms, especially on large-scale problem instances.
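
To make the target of the learned mapping concrete, the following is a minimal sketch of what a Pareto-optimal set looks like for a tiny bi-objective TSP. It is not the proposed model: it enumerates all tours by brute force, which is only feasible for a handful of cities. The instance data (`coords_a`, `coords_b`) and all function names are illustrative assumptions; the sketch also assumes each objective is a Euclidean tour length over its own set of city coordinates, which is one common MOTSP formulation.

```python
import itertools
import math


def tour_length(tour, coords):
    """Length of the closed tour under one objective's city coordinates."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))


def objectives(tour, coords_per_objective):
    """Objective vector of a tour: one tour length per objective (rounded to tame float noise)."""
    return tuple(round(tour_length(tour, c), 9) for c in coords_per_objective)


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Sorted list of the non-dominated objective vectors among `points`."""
    unique = set(points)
    return sorted(p for p in unique if not any(dominates(q, p) for q in unique))


def enumerate_tours(n):
    """All tours over n cities with city 0 fixed as the start."""
    for perm in itertools.permutations(range(1, n)):
        yield (0,) + perm


# Hypothetical toy instance: 5 cities, each with a separate coordinate pair per objective.
coords_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
coords_b = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0), (0.2, 0.8)]

costs = [objectives(t, (coords_a, coords_b)) for t in enumerate_tours(5)]
front = pareto_front(costs)

for point in front:
    print(point)
```

Each printed pair is a trade-off between the two tour lengths that no other tour improves on in both objectives at once. A learned solver such as the one described above aims to approximate this set directly, without enumeration.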

Similar articles

A deep reinforcement learning algorithm framework for solving multi-objective traveling salesman problem based on feature transformation.

Neural Netw. 2024 Aug;176:106359. doi: 10.1016/j.neunet.2024.106359. Epub 2024 May 3.

Deep Reinforcement Learning for Multiobjective Optimization.

IEEE Trans Cybern. 2021 Jun;51(6):3103-3114. doi: 10.1109/TCYB.2020.2977661. Epub 2021 May 18.

The Collaborative Local Search Based on Dynamic-Constrained Decomposition With Grids for Combinatorial Multiobjective Optimization.

IEEE Trans Cybern. 2021 May;51(5):2639-2650. doi: 10.1109/TCYB.2019.2931434. Epub 2021 Apr 15.

Meta-Learning-Based Deep Reinforcement Learning for Multiobjective Optimization Problems.

IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7978-7991. doi: 10.1109/TNNLS.2022.3148435. Epub 2023 Oct 6.

Set-Based Discrete Particle Swarm Optimization Based on Decomposition for Permutation-Based Multiobjective Combinatorial Optimization Problems.

IEEE Trans Cybern. 2018 Jul;48(7):2139-2153. doi: 10.1109/TCYB.2017.2728120. Epub 2017 Aug 7.

Conditional Neural Heuristic for Multiobjective Vehicle Routing Problems.

IEEE Trans Neural Netw Learn Syst. 2025 Mar;36(3):4677-4689. doi: 10.1109/TNNLS.2024.3371706. Epub 2025 Feb 28.

Learning Feature Embedding Refiner for Solving Vehicle Routing Problems.

IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15279-15291. doi: 10.1109/TNNLS.2023.3285077. Epub 2024 Oct 29.

Hybridization of decomposition and local search for multiobjective optimization.

IEEE Trans Cybern. 2014 Oct;44(10):1808-20. doi: 10.1109/TCYB.2013.2295886.

An accelerated end-to-end method for solving routing problems.

Neural Netw. 2023 Jul;164:535-545. doi: 10.1016/j.neunet.2023.05.003. Epub 2023 May 10.

Cited by

A Reinforcement Learning-Based Bi-Population Nutcracker Optimizer for Global Optimization.

Biomimetics (Basel). 2024 Oct 1;9(10):596. doi: 10.3390/biomimetics9100596.
