NUMA computing with hardware and software co-support on configurable emulated shared memory architectures

Martti Forsell, E. Hansson, C. Kessler, J-M. Mäkelä, V. Leppänen

    Research output: Contribution to journal > Article > Scientific > peer-review

    Abstract

    Emulated shared memory (ESM) architectures are good candidates for future general-purpose parallel computers because they provide programmers with an easy-to-use, explicitly parallel synchronous model of computation and avoid most performance bottlenecks present in current multicore architectures. To achieve full performance, however, applications must have enough thread-level parallelism (TLP). To solve this problem, in our earlier work we introduced a class of configurable emulated shared memory (CESM) machines that provides a special non-uniform memory access (NUMA) mode for situations where TLP is limited, or for direct compatibility with legacy code written for sequential computing or the NUMA mechanism. Unfortunately, the earlier proposed CESM architecture does not integrate its different modes well: for example, the memories for the different modes are left isolated, and the programming interface is therefore non-integrated. In this paper we propose a number of hardware and software techniques to support NUMA computing in CESM architectures in a seamless way. The hardware techniques include three different NUMA shared memory access mechanisms, and the software techniques provide a mechanism to integrate and optimize NUMA computation into the standard parallel random access machine (PRAM) operation of the CESM. The hardware techniques are evaluated on our REPLICA CESM architecture and compared to an ideal CESM machine making use of the proposed software techniques.
    Original language: English
    Pages (from-to): 189-206
    Number of pages: 18
    Journal: International Journal of Networking and Computing
    Volume: 4
    Issue number: 1
    Publication status: Published - 2014
    MoE publication type: A1 Journal article-refereed

    Fingerprint

    Memory architecture
    Computer hardware
    Data storage equipment
    Computer programming
    Interfaces (computer)

    Keywords

    • parallel computing
    • models of computation
    • programming model
    • shared memory emulation
    • NUMA
    • PRAM

    Cite this

    @article{8f3c690aac464aae91faf1b36961be24,
    title = "NUMA computing with hardware and software co-support on configurable emulated shared memory architectures",
    keywords = "parallel computing, models of computation, programming model, shared memory emulation, NUMA, PRAM",
    author = "Martti Forsell and E. Hansson and C. Kessler and J-M. M{\"a}kel{\"a} and V. Lepp{\"a}nen",
    year = "2014",
    language = "English",
    volume = "4",
    pages = "189--206",
    journal = "International Journal of Networking and Computing",
    issn = "2185-2839",
    publisher = "Hiroshima University",
    number = "1",

    }

    NUMA computing with hardware and software co-support on configurable emulated shared memory architectures. / Forsell, Martti; Hansson, E.; Kessler, C.; Mäkelä, J-M.; Leppänen, V.

    In: International Journal of Networking and Computing, Vol. 4, No. 1, 2014, p. 189-206.


    TY - JOUR

    T1 - NUMA computing with hardware and software co-support on configurable emulated shared memory architectures

    AU - Forsell, Martti

    AU - Hansson, E.

    AU - Kessler, C.

    AU - Mäkelä, J-M.

    AU - Leppänen, V.

    PY - 2014

    Y1 - 2014

    KW - parallel computing

    KW - models of computation

    KW - programming model

    KW - shared memory emulation

    KW - NUMA

    KW - PRAM

    M3 - Article

    VL - 4

    SP - 189

    EP - 206

    JO - International Journal of Networking and Computing

    JF - International Journal of Networking and Computing

    SN - 2185-2839

    IS - 1

    ER -