Offline Reinforcement Learning for Adaptive Control in Manufacturing Processes: A Press Hardening Case Study

Nuria Nievas, Leonardo Espinosa-Leal, Adela Pagès-Bernaus, Albert Abio, Lluís Echeverria, Francesc Bonada

Research output: Contribution to journal › Article › Scientific › peer-review

1 Citation (Scopus)

Abstract

This paper explores the application of offline reinforcement learning in batch manufacturing, with a specific focus on press hardening processes. Offline reinforcement learning presents a viable alternative to traditional control and reinforcement learning methods, which often rely on impractical real-world interactions or complex simulations and iterative adjustments to bridge the gap between simulated and real-world environments. We demonstrate how offline reinforcement learning can improve control policies by leveraging existing data, thereby streamlining the training pipeline and reducing reliance on high-fidelity simulators. Our study evaluates the impact of varying data exploration rates by creating five datasets with exploration rates ranging from epsilon = 0 to epsilon = 0.8. Using the Conservative Q-Learning algorithm, we train and assess policies against both a dynamic baseline and a static industry-standard policy. The results indicate that while offline reinforcement learning effectively refines behavior policies and enhances supervised learning methods, its effectiveness is heavily dependent on the quality and exploratory nature of the initial behavior policy.
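
As an illustration of the setup described in the abstract, the following is a minimal sketch, not the authors' implementation. It shows how datasets with different exploration rates could be generated from a behavior policy via epsilon-greedy action selection, and how the conservative regularizer of Conservative Q-Learning can be computed for a discrete action space. The evenly spaced epsilon values, the toy `behavior_policy`, the integer states, and all function names are assumptions made for illustration; the abstract states only that five datasets span epsilon = 0 to epsilon = 0.8.

```python
import numpy as np


def epsilon_greedy(behavior_action, n_actions, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else keep the behavior action."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(behavior_action)


def collect_dataset(behavior_policy, states, n_actions, epsilon, seed=0):
    """Log (state, action) pairs produced by an epsilon-greedy version of the behavior policy."""
    rng = np.random.default_rng(seed)
    return [
        (s, epsilon_greedy(behavior_policy(s), n_actions, epsilon, rng))
        for s in states
    ]


def cql_penalty(q_values, data_actions):
    """Conservative term: mean of logsumexp_a Q(s, a) - Q(s, a_data) over the batch.

    q_values has shape (batch, n_actions); data_actions holds the logged action indices.
    """
    q_values = np.asarray(q_values, dtype=float)
    data_actions = np.asarray(data_actions, dtype=int)
    # Numerically stable log-sum-exp over the action axis.
    q_max = q_values.max(axis=1, keepdims=True)
    lse = q_max[:, 0] + np.log(np.exp(q_values - q_max).sum(axis=1))
    chosen = q_values[np.arange(len(q_values)), data_actions]
    return float(np.mean(lse - chosen))


# Assumption: five evenly spaced exploration rates between 0 and 0.8.
EPSILONS = np.linspace(0.0, 0.8, 5)
N_ACTIONS = 3


def behavior_policy(state):
    """Hypothetical stand-in for a rule-based controller."""
    return state % N_ACTIONS


datasets = {
    round(float(eps), 1): collect_dataset(behavior_policy, range(1000), N_ACTIONS, eps)
    for eps in EPSILONS
}
```

In a full training loop the penalty would be added, with a weighting coefficient, to the usual temporal-difference loss. It pushes down Q-values of actions that are absent from the logged data, which is one way to see why the coverage of the behavior dataset matters for the learned policy.
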
Original language: English
Article number: 011004
Journal: Journal of Computing and Information Science in Engineering
Volume: 25
Issue number: 1
Publication status: Published - 2025
MoE publication type: A1 Journal article-refereed

Keywords

  • artificial intelligence
  • machine learning for engineering applications
  • manufacturing automation
