Li Chen, Dong Wenhan, He Lei, Cai Ming, Wang Dafei
Air Force Engineering University, Xi'an, 710038, China.
Sci Rep. 2025 Mar 19;15(1):9418. doi: 10.1038/s41598-025-86229-y.
To address the convergence difficulties and suboptimal performance that arise when reinforcement learning is applied to intelligent decision-making for joint operations, this study introduces a decision-making approach based on an improved Proximal Policy Optimization (PPO) algorithm. We propose a structured intelligent decision-making model designed to execute decision-making functions effectively. The strategy loss mechanism is improved by constraining the upper limit of the strategy loss function. Furthermore, a priority sampling mechanism is developed to assess sample values, thereby improving the efficiency of sampling during training. Additionally, a network structure that supports distributed interaction and centralized learning is designed to speed up training. The proposed method is then applied to intelligent decision-making on a joint operations simulation platform. Simulation results show that the algorithm addresses the aforementioned issues, makes autonomous decisions based on battlefield dynamics, and ultimately achieves victory in the simulation.
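The two algorithmic ideas named in the abstract, an upper limit on the strategy loss and priority sampling, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption: the bound is written in the style of the known dual-clip PPO variant (parameter `cap`), and sample value is approximated by the absolute advantage raised to a hypothetical exponent `alpha`. The function names `clipped_policy_loss` and `priority_sample` are illustrative and do not come from the paper.

```python
import random


def clipped_policy_loss(ratio, advantage, eps=0.2, cap=3.0):
    """Per-sample PPO policy loss with an upper bound (assumed dual-clip form).

    ratio     : pi_new(a|s) / pi_old(a|s)
    advantage : estimated advantage of the sampled action
    eps       : standard PPO clipping range
    cap       : assumed bound factor (> 1); the loss never exceeds -cap * advantage
                when the advantage is negative
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Standard PPO clipped surrogate objective (to be maximized).
    surrogate = min(ratio * advantage, clipped_ratio * advantage)
    if advantage < 0:
        # Without this step the loss grows without limit as ratio grows.
        # Assumption: bound it the way dual-clip PPO does.
        surrogate = max(surrogate, cap * advantage)
    return -surrogate


def priority_sample(priorities, k, alpha=0.6, rng=None):
    """Draw k sample indices with probability proportional to |priority|**alpha.

    Assumption: a sample's value is proxied by a scalar priority such as its
    absolute advantage; the paper's actual value measure may differ.
    """
    rng = rng or random.Random()
    # Small constant keeps zero-priority samples reachable.
    weights = [(abs(p) + 1e-6) ** alpha for p in priorities]
    return rng.choices(range(len(priorities)), weights=weights, k=k)


if __name__ == "__main__":
    # A very large ratio with a negative advantage is capped at cap * |advantage|.
    print(clipped_policy_loss(ratio=10.0, advantage=-1.0))
    # High-priority samples dominate the drawn minibatch.
    print(priority_sample([0.1, 0.1, 5.0], k=8, rng=random.Random(0)))
```

With the bound in place, a single sample with a very large probability ratio and a negative advantage can no longer dominate the gradient, which is one plausible reading of how limiting the loss helps convergence. The sampler simply biases minibatches toward samples with larger assumed value.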