IntControl LLC and CLION, The University of Memphis, 38152 Memphis, TN, USA.
Neural Netw. 2012 Aug;32:179-85. doi: 10.1016/j.neunet.2012.02.036. Epub 2012 Feb 16.
Large-scale networks with hundreds of thousands of variables and constraints are increasingly common in logistics, communications, and distribution. Traditionally, utility functions defined on such networks are optimized with some variant of Linear Programming, such as Mixed Integer Programming (MIP). Despite enormous progress in both hardware (multiprocessor systems and specialized processors) and software (e.g., Gurobi), we are reaching the limits of what these tools can handle in real time. Modern logistics problems, for example, call for expanding the problem both vertically (from one day to several days) and horizontally (combining separate solution stages into an integrated model). The complexity of such integrated models calls for alternative solution methods, such as Approximate Dynamic Programming (ADP), which provide the further increase in performance needed for daily operation. In this paper, we present the theoretical basis and related experiments for solving multistage decision problems using results obtained for shorter periods as building blocks for the models and the solution. The approach relies on Critic-Model-Action cycles, in which various types of neural networks are combined with traditional MIP models in a unified optimization system. In this architecture, fast and simple feed-forward networks are trained to provide a reasonable initialization for more complicated recurrent networks, which serve as approximators of the value function (Critic). The combination of interrelated neural networks and optimization modules allows multiple queries of the same system, providing flexibility and optimizing performance for large-scale real-life problems. A MATLAB implementation of our solution procedure, applied to a realistic set of data and constraints, shows promising results compared to the iterative MIP approach.
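The idea of solving a multi-day problem from single-day building blocks guided by a value function can be illustrated with a minimal sketch. It is not the authors' implementation: the toy inventory problem, all numbers, and the names `step`, `critic`, `action`, and `plan` are hypothetical. An exact tabular cost-to-go stands in for the recurrent-network Critic, and brute-force enumeration stands in for the single-stage MIP solve.

```python
from functools import lru_cache

# Hypothetical toy problem: every number below is an illustrative assumption.
DEMAND = [3, 1, 4, 2]      # known demand for each day of the horizon
MAX_STOCK = 6              # warehouse capacity
MAX_ORDER = 5              # largest order allowed per day
ORDER_COST, HOLD_COST, SHORT_COST = 2, 1, 10


def step(day, stock, order):
    """Model: return (cost, next_stock) for one day."""
    available = min(stock + order, MAX_STOCK)
    shortfall = max(0, DEMAND[day] - available)
    next_stock = max(0, available - DEMAND[day])
    cost = ORDER_COST * order + HOLD_COST * next_stock + SHORT_COST * shortfall
    return cost, next_stock


@lru_cache(maxsize=None)
def critic(day, stock):
    """Critic: cost-to-go from `day` onward when starting with `stock`.

    Computed exactly here; in the paper's setting this role is played by
    a neural-network approximation.
    """
    if day == len(DEMAND):
        return 0
    outcomes = (step(day, stock, order) for order in range(MAX_ORDER + 1))
    return min(cost + critic(day + 1, nxt) for cost, nxt in outcomes)


def action(day, stock):
    """Action: single-day decision minimizing day cost plus critic estimate."""
    def total(order):
        cost, nxt = step(day, stock, order)
        return cost + critic(day + 1, nxt)
    return min(range(MAX_ORDER + 1), key=total)


def plan(initial_stock=0):
    """Roll the single-day decisions forward over the whole horizon."""
    stock, orders, total_cost = initial_stock, [], 0
    for day in range(len(DEMAND)):
        order = action(day, stock)
        cost, stock = step(day, stock, order)
        orders.append(order)
        total_cost += cost
    return orders, total_cost


print(plan(0))  # ([3, 1, 4, 2], 20)
```

Each call to `action` solves only a one-day problem; the multi-day coupling enters solely through `critic`. Swapping the exact recursion for a learned approximator is what turns this into an approximate dynamic programming scheme.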