Abstract
With recent advances in deep reinforcement learning, it is time to take another look at reinforcement learning as an approach for discrete production control. We applied proximal policy optimization (PPO), a recently developed algorithm for deep reinforcement learning, to the stochastic economic lot scheduling problem. The problem involves scheduling manufacturing decisions on a single machine under stochastic demand, and despite its simplicity it remains computationally challenging. We implemented two parameterized models for the control policy and value approximation, a linear model and a neural network, and used a modified PPO algorithm to seek the optimal parameter values. Benchmarking against the best-known control policy for the test case, in which Paternina-Arboleda and Das (2005) combined a base-stock policy and an older reinforcement learning algorithm, we reduced the average cost rate by 2%. Our approach is more general, as we do not require a priori policy parameters such as base-stock levels, and the entire policy is learned.
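The paper's modified PPO variant is not reproduced here, but the standard PPO clipped surrogate objective that the abstract refers to can be sketched in a few lines. The sketch below is our own minimal illustration, not the authors' implementation; the names `ratio`, `advantage`, and `clip_eps` are ours, and the toy inputs stand in for probability ratios and advantage estimates that would come from a lot-scheduling policy and its value approximation.

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimized).

    ratio     -- pi_theta(a|s) / pi_theta_old(a|s) for each sampled action
    advantage -- estimated advantage for each sampled action
    clip_eps  -- clipping parameter epsilon (0.2 is a common default)
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    # PPO maximizes the elementwise minimum of the two terms,
    # which bounds how far a single update can move the policy.
    return -np.mean(np.minimum(unclipped, clipped))

# Toy usage: three sampled actions with hypothetical ratios and advantages.
ratio = np.array([1.1, 0.7, 1.4])
advantage = np.array([0.5, -0.2, 0.8])
print(ppo_clip_loss(ratio, advantage))
```

In a setup like the one described, this loss would be computed over trajectories simulated from the stochastic demand process, with the policy given by either the linear model or the neural network.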
Original language | English
---|---
Pages (from-to) | 1415-1420
Number of pages | 6
Journal | IFAC-PapersOnLine
Volume | 52
Issue number | 13
Early online date | 25 Dec 2019
DOIs |
Publication status | Published - 2019
MoE publication type | A4 Article in a conference publication
Event | 9th IFAC Conference on Manufacturing Modelling, Management and Control, Berlin, Germany, 28–30 Aug 2019
Keywords
- Reinforcement learning
- Stochastic economic lot scheduling
- Learning control
- Stochastic control
- Monte Carlo simulation
- Neural networks
- Machine learning