Practical Reinforcement Learning: Experiences in Lot Scheduling Application

Hannu Rummukainen, Jukka K. Nurminen

Research output: Contribution to journal › Article in a proceedings journal › Scientific › peer-review

Abstract

With recent advances in deep reinforcement learning, it is time to take another look at reinforcement learning as an approach for discrete production control. We applied proximal policy optimization (PPO), a recently developed algorithm for deep reinforcement learning, to the stochastic economic lot scheduling problem. The problem involves scheduling manufacturing decisions on a single machine under stochastic demand, and despite its simplicity it remains computationally challenging. We implemented two parameterized models for the control policy and value approximation, a linear model and a neural network, and used a modified PPO algorithm to seek the optimal parameter values. Benchmarking against the best known control policy for the test case, in which Paternina-Arboleda and Das (2005) combined a base-stock policy with an older reinforcement learning algorithm, we reduced the average cost rate by 2 %. Our approach is more general, as we do not require a priori policy parameters such as base-stock levels; the entire policy is learned.
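
The paper itself is only summarized here, so no code accompanies it. The following is a minimal sketch of the kind of setup the abstract describes: a standard clipped-surrogate PPO update (the authors use a modified PPO whose modifications are not detailed in the abstract) with linear policy and value models on a toy single-machine, multi-product lot-scheduling environment. All environment parameters, cost coefficients, and hyperparameters below are illustrative assumptions, not values from the paper.

```python
# Illustrative sketch only: clipped-surrogate PPO with a linear policy/value
# model on a toy single-machine lot-scheduling environment. Every constant
# here is an assumption for demonstration, not taken from the paper.
import numpy as np
import torch
import torch.nn as nn

N_PRODUCTS = 2        # assumed: two products compete for one machine
HOLD_COST = 1.0       # assumed holding cost per unit per period
BACKLOG_COST = 10.0   # assumed backorder cost per unit per period
PROD_RATE = 3         # assumed units produced per period when running
DEMAND_MEAN = 1.0     # assumed Poisson demand rate per product per period

rng = np.random.default_rng(0)

def step(inv, action):
    """One period: produce for the chosen product (or idle), then demand
    arrives. Negative inventory represents backorders."""
    inv = inv.copy()
    if action < N_PRODUCTS:   # action == N_PRODUCTS means idle
        inv[action] += PROD_RATE
    inv -= rng.poisson(DEMAND_MEAN, size=N_PRODUCTS)
    cost = (HOLD_COST * np.clip(inv, 0, None).sum()
            + BACKLOG_COST * np.clip(-inv, 0, None).sum())
    return inv, -cost         # reward is the negative cost rate

# Linear policy and value models; the paper's exact features and
# parameterization may differ.
policy = nn.Linear(N_PRODUCTS, N_PRODUCTS + 1)  # logits: produce-i / idle
value = nn.Linear(N_PRODUCTS, 1)
opt = torch.optim.Adam(list(policy.parameters()) + list(value.parameters()),
                       lr=1e-2)

GAMMA, CLIP_EPS, EPOCHS, HORIZON = 0.99, 0.2, 10, 256

for it in range(50):
    # Roll out one trajectory with the current policy.
    inv = np.zeros(N_PRODUCTS)
    states, actions, rewards = [], [], []
    for _ in range(HORIZON):
        s = torch.tensor(inv, dtype=torch.float32)
        dist = torch.distributions.Categorical(logits=policy(s))
        a = dist.sample()
        inv, r = step(inv, a.item())
        states.append(s); actions.append(a); rewards.append(r)

    # Discounted returns, and advantages against the value baseline.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + GAMMA * g
        returns.append(g)
    returns.reverse()
    S = torch.stack(states)
    A = torch.stack(actions)
    R = torch.tensor(returns, dtype=torch.float32)
    with torch.no_grad():
        old_logp = torch.distributions.Categorical(logits=policy(S)).log_prob(A)
        adv = R - value(S).squeeze(-1)
        adv = (adv - adv.mean()) / (adv.std() + 1e-8)

    # Several epochs of clipped-surrogate updates on the same batch.
    for _ in range(EPOCHS):
        dist = torch.distributions.Categorical(logits=policy(S))
        ratio = torch.exp(dist.log_prob(A) - old_logp)
        clipped = torch.clamp(ratio, 1 - CLIP_EPS, 1 + CLIP_EPS)
        policy_loss = -torch.min(ratio * adv, clipped * adv).mean()
        value_loss = (value(S).squeeze(-1) - R).pow(2).mean()
        opt.zero_grad()
        (policy_loss + 0.5 * value_loss).backward()
        opt.step()

    print(f"iter {it:2d}  mean cost rate {-np.mean(rewards):.2f}")
```

The linear softmax policy corresponds to the simpler of the two models mentioned in the abstract; swapping each nn.Linear for a small multilayer network would give the neural-network variant.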
Original language: English
Pages (from-to): 1415-1420
Number of pages: 6
Journal: IFAC-PapersOnLine
Volume: 52
Issue number: 13
Early online date: 25 Dec 2019
DOIs
Publication status: Published - 2019
MoE publication type: A4 Article in a conference publication
Event: 9th IFAC Conference on Manufacturing Modelling, Management and Control - Berlin, Germany
Duration: 28 Aug 2019 - 30 Aug 2019

Keywords

  • Reinforcement learning
  • Stochastic economic lot scheduling
  • Learning control
  • Stochastic control
  • Monte Carlo simulation
  • Neural networks
  • Machine learning
