Multi-Tenant Cross-Slice Resource Orchestration

A Deep Reinforcement Learning Approach

Xianfu Chen, Zhifeng Zhao, Celimuge Wu, Mehdi Bennis, Hang Liu, Yusheng Ji, Honggang Zhang

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

As cellular networks become increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution for tailoring the network to such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to the MUs of multiple service providers (SPs), i.e., the tenants. Each SP behaves selfishly, competing with the other SPs for the orchestration of channels so as to maximize its expected long-term payoff; winning channels provides its MUs with opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decisions of an SP depend on the global network dynamics as well as the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with local conjectures of the channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify decision-making at an SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.
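The linear-decomposition step mentioned in the abstract can be illustrated with a toy sketch: the per-SP action value is treated as a sum of per-component value functions, each learned with an independent one-step Q-learning update. Everything below (two states, two actions, the random reward model) is invented purely for illustration and is not the paper's system model or algorithm.

```python
import numpy as np

# Toy illustration of linearly decomposing a value function:
# the SP's aggregate Q-function is the sum of per-component Q-tables,
# each updated independently with standard one-step Q-learning.
# States, actions, and rewards here are hypothetical.

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, GAMMA, ALPHA = 2, 2, 0.9, 0.1

def q_update(Q, s, a, r, s_next):
    """One-step Q-learning update on a tabular Q-function (in place)."""
    td_target = r + GAMMA * Q[s_next].max()
    Q[s, a] += ALPHA * (td_target - Q[s, a])

# Two per-component Q-tables (e.g. one per slice functionality).
Q_comp = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(2)]

for _ in range(500):
    s = rng.integers(N_STATES)
    a = rng.integers(N_ACTIONS)
    s_next = rng.integers(N_STATES)
    # Each component observes its own reward and updates independently.
    for k, Q in enumerate(Q_comp):
        r = rng.random() + k  # toy reward; component 1 pays more on average
        q_update(Q, s, a, r, s_next)

# The aggregate action value is recovered as the sum of the components,
# so the decision at a state only needs the per-component tables.
Q_total = sum(Q_comp)
print(Q_total.shape)
```

The point of the sketch is that each sub-problem stays small: the learner never stores a joint table over all components, only the per-component tables whose sum drives the decision.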

Original language: English
Pages (from-to): 2377-2392
Number of pages: 16
Journal: IEEE Journal on Selected Areas in Communications
Volume: 37
Issue number: 10
DOI: 10.1109/JSAC.2019.2933893
Publication status: Published - 2019
MoE publication type: A1 Journal article-refereed

Keywords

  • deep reinforcement learning
  • Markov decision process
  • mobile-edge computing
  • Network slicing
  • packet scheduling
  • radio access networks

Cite this

Chen, Xianfu; Zhao, Zhifeng; Wu, Celimuge; Bennis, Mehdi; Liu, Hang; Ji, Yusheng; Zhang, Honggang. / Multi-Tenant Cross-Slice Resource Orchestration: A Deep Reinforcement Learning Approach. In: IEEE Journal on Selected Areas in Communications. 2019; Vol. 37, No. 10, pp. 2377-2392.
@article{2324895f53764d33a56c05f1da1c9e22,
title = "Multi-Tenant Cross-Slice Resource Orchestration: A Deep Reinforcement Learning Approach",
abstract = "With the cellular networks becoming increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution to tailor the network to match such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to MUs of multiple service providers (SPs) (i.e., the tenants). Each SP behaves selfishly to maximize the expected long-term payoff from the competition with other SPs for the orchestration of channels, which provides its MUs with the opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decision makings of a SP depend on the global network dynamics as well as the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with the local conjectures of channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify the decision makings at a SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.",
keywords = "deep reinforcement learning, Markov decision process, mobile-edge computing, Network slicing, packet scheduling, radio access networks",
author = "Xianfu Chen and Zhifeng Zhao and Celimuge Wu and Mehdi Bennis and Hang Liu and Yusheng Ji and Honggang Zhang",
year = "2019",
doi = "10.1109/JSAC.2019.2933893",
language = "English",
volume = "37",
pages = "2377--2392",
journal = "IEEE Journal on Selected Areas in Communications",
issn = "0733-8716",
publisher = "Institute of Electrical and Electronics Engineers (IEEE)",
number = "10",

}

TY - JOUR
T1 - Multi-Tenant Cross-Slice Resource Orchestration
T2 - A Deep Reinforcement Learning Approach
AU - Chen, Xianfu
AU - Zhao, Zhifeng
AU - Wu, Celimuge
AU - Bennis, Mehdi
AU - Liu, Hang
AU - Ji, Yusheng
AU - Zhang, Honggang
PY - 2019
Y1 - 2019
N2 - With the cellular networks becoming increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution to tailor the network to match such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to MUs of multiple service providers (SPs) (i.e., the tenants). Each SP behaves selfishly to maximize the expected long-term payoff from the competition with other SPs for the orchestration of channels, which provides its MUs with the opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decision makings of a SP depend on the global network dynamics as well as the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with the local conjectures of channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify the decision makings at a SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.
AB - With the cellular networks becoming increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution to tailor the network to match such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to MUs of multiple service providers (SPs) (i.e., the tenants). Each SP behaves selfishly to maximize the expected long-term payoff from the competition with other SPs for the orchestration of channels, which provides its MUs with the opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decision makings of a SP depend on the global network dynamics as well as the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with the local conjectures of channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify the decision makings at a SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.
KW - deep reinforcement learning
KW - Markov decision process
KW - mobile-edge computing
KW - Network slicing
KW - packet scheduling
KW - radio access networks
UR - http://www.scopus.com/inward/record.url?scp=85070673256&partnerID=8YFLogxK
U2 - 10.1109/JSAC.2019.2933893
DO - 10.1109/JSAC.2019.2933893
M3 - Article
VL - 37
SP - 2377
EP - 2392
JO - IEEE Journal on Selected Areas in Communications
JF - IEEE Journal on Selected Areas in Communications
SN - 0733-8716
IS - 10
ER -