Can Reinforcement Learning Improve Order Decision in Multi-Echelon Inventory Systems? A Linear System Case Study

EasyChair Preprint 10095
8 pages • Date: May 12, 2023

Abstract

In this paper, we introduce a novel Markov Decision Process (MDP) formulation tailored to linear inventory systems and present a reinforcement learning (RL) algorithm, termed Shaped-nStep-Double-DQN. We construct several three-echelon linear inventory systems and cast their order placement problems as optimal policy problems within the MDP framework. The experiments show that, in deterministic linear inventory systems, the ordering policies learned by Shaped-nStep-Double-DQN closely approximate the optimal ordering policies. In stochastic linear inventory systems, the learned ordering policies outperform the base-stock policy in inventory performance.

Keyphrases: Multi-echelon Inventory Systems, Reinforcement Learning, Shaped-nStep-Double-DQN, order decision