Multi-Tenant Cross-Slice Resource Orchestration: A Deep Reinforcement Learning Approach

Xianfu Chen, Zhifeng Zhao, Celimuge Wu, Mehdi Bennis, Hang Liu, Yusheng Ji, Honggang Zhang

    Research output: Contribution to journal › Article › Scientific › peer-review

    1 Citation (Scopus)

    Abstract

    With cellular networks becoming increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution for tailoring the network to such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to the MUs of multiple service providers (SPs), i.e., the tenants. Each SP behaves selfishly, maximizing its expected long-term payoff from the competition with the other SPs for the orchestration of channels, which provides its MUs with opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decision-making of an SP depends on the global network dynamics as well as on the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with local conjectures of the channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify the decision-making at an SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.
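    The linear decomposition mentioned in the abstract can be illustrated with a toy sketch (this is not the authors' implementation, and the paper's scheme uses deep function approximation rather than tables): the per-SP Q-function is approximated as a sum of per-MU Q-functions, so the argmax over the SP's joint auction action separates into independent per-MU argmaxes. All states, actions, rewards, and transition dynamics below are hypothetical placeholders.

    ```python
    import random

    GAMMA, ALPHA = 0.9, 0.1
    STATES = range(3)   # e.g. a quantized per-MU queue length (placeholder)
    ACTIONS = range(2)  # e.g. 0 = do not bid, 1 = bid for a channel (placeholder)

    def make_q():
        # One tabular Q-function per MU.
        return {(s, a): 0.0 for s in STATES for a in ACTIONS}

    def greedy_joint_action(q_per_mu, states):
        # Linear decomposition: the argmax of a sum of per-MU Q-functions
        # separates into independent per-MU argmaxes.
        return [max(ACTIONS, key=lambda a: q[(s, a)])
                for q, s in zip(q_per_mu, states)]

    def td_update(q, s, a, r, s_next):
        # Standard one-step Q-learning update on a single per-MU table.
        best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
        q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])

    # Toy rollout for one SP serving two MUs, with epsilon-greedy exploration.
    random.seed(0)
    q_per_mu = [make_q(), make_q()]
    states = [0, 0]
    for _ in range(500):
        if random.random() > 0.2:
            actions = greedy_joint_action(q_per_mu, states)
        else:
            actions = [random.choice(list(ACTIONS)) for _ in states]
        for i, (q, s, a) in enumerate(zip(q_per_mu, states, actions)):
            r = 1.0 if a == 1 else 0.0            # placeholder per-MU payoff
            s_next = random.choice(list(STATES))  # placeholder dynamics
            td_update(q, s, a, r, s_next)
            states[i] = s_next
    ```

    Because each per-MU table is updated only from that MU's own transition, the per-SP learning problem shrinks from a joint state-action space to a set of small independent ones, which is the practical benefit of the decomposition the abstract describes.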

    Original language: English
    Pages (from-to): 2377-2392
    Number of pages: 16
    Journal: IEEE Journal on Selected Areas in Communications
    Volume: 37
    Issue number: 10
    DOI: 10.1109/JSAC.2019.2933893
    Publication status: Published - 2019
    MoE publication type: A1 Journal article-refereed

    Keywords

    • deep reinforcement learning
    • Markov decision process
    • mobile-edge computing
    • Network slicing
    • packet scheduling
    • radio access networks

    Cite this

    Chen, Xianfu; Zhao, Zhifeng; Wu, Celimuge; Bennis, Mehdi; Liu, Hang; Ji, Yusheng; Zhang, Honggang. / Multi-Tenant Cross-Slice Resource Orchestration: A Deep Reinforcement Learning Approach. In: IEEE Journal on Selected Areas in Communications. 2019; Vol. 37, No. 10, pp. 2377-2392.
    @article{2324895f53764d33a56c05f1da1c9e22,
    title = "Multi-Tenant Cross-Slice Resource Orchestration: A Deep Reinforcement Learning Approach",
    abstract = "With the cellular networks becoming increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution to tailor the network to match such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to MUs of multiple service providers (SPs) (i.e., the tenants). Each SP behaves selfishly to maximize the expected long-term payoff from the competition with other SPs for the orchestration of channels, which provides its MUs with the opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decision makings of a SP depend on the global network dynamics as well as the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with the local conjectures of channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify the decision makings at a SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.",
    keywords = "deep reinforcement learning, Markov decision process, mobile-edge computing, Network slicing, packet scheduling, radio access networks",
    author = "Xianfu Chen and Zhifeng Zhao and Celimuge Wu and Mehdi Bennis and Hang Liu and Yusheng Ji and Honggang Zhang",
    year = "2019",
    doi = "10.1109/JSAC.2019.2933893",
    language = "English",
    volume = "37",
    pages = "2377--2392",
    journal = "IEEE Journal on Selected Areas in Communications",
    issn = "0733-8716",
    publisher = "Institute of Electrical and Electronics Engineers (IEEE)",
    number = "10",

    }


    TY - JOUR

    T1 - Multi-Tenant Cross-Slice Resource Orchestration

    T2 - A Deep Reinforcement Learning Approach

    AU - Chen, Xianfu

    AU - Zhao, Zhifeng

    AU - Wu, Celimuge

    AU - Bennis, Mehdi

    AU - Liu, Hang

    AU - Ji, Yusheng

    AU - Zhang, Honggang

    PY - 2019

    Y1 - 2019

    N2 - With the cellular networks becoming increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution to tailor the network to match such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to MUs of multiple service providers (SPs) (i.e., the tenants). Each SP behaves selfishly to maximize the expected long-term payoff from the competition with other SPs for the orchestration of channels, which provides its MUs with the opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decision makings of a SP depend on the global network dynamics as well as the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with the local conjectures of channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify the decision makings at a SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.

    AB - With the cellular networks becoming increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution to tailor the network to match such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to MUs of multiple service providers (SPs) (i.e., the tenants). Each SP behaves selfishly to maximize the expected long-term payoff from the competition with other SPs for the orchestration of channels, which provides its MUs with the opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decision makings of a SP depend on the global network dynamics as well as the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with the local conjectures of channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify the decision makings at a SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.

    KW - deep reinforcement learning

    KW - Markov decision process

    KW - mobile-edge computing

    KW - Network slicing

    KW - packet scheduling

    KW - radio access networks

    UR - http://www.scopus.com/inward/record.url?scp=85070673256&partnerID=8YFLogxK

    U2 - 10.1109/JSAC.2019.2933893

    DO - 10.1109/JSAC.2019.2933893

    M3 - Article

    AN - SCOPUS:85070673256

    VL - 37

    SP - 2377

    EP - 2392

    JO - IEEE Journal on Selected Areas in Communications

    JF - IEEE Journal on Selected Areas in Communications

    SN - 0733-8716

    IS - 10

    ER -