Abstract
Emulated shared memory (ESM) architectures are good
candidates for future general-purpose parallel computers
because they provide programmers with an easy-to-use, explicitly
parallel, synchronous model of computation
and avoid most of the performance bottlenecks
present in current multicore architectures. In order to
achieve full performance, however, applications must
have enough thread-level parallelism (TLP). To address this
requirement, in our earlier work we introduced a class
of configurable emulated shared memory (CESM) machines
that provide a special non-uniform memory access (NUMA)
mode for situations where TLP is limited or where direct
compatibility with legacy sequential code and
NUMA mechanisms is required. Unfortunately, the earlier proposed CESM
architecture does not integrate its different modes
well: for example, it leaves the
memories of the different modes isolated, and as a result the
programming interface is not integrated either. In this paper we
propose a number of hardware and software techniques to
support NUMA computing in CESM architectures in a
seamless way. The hardware techniques include three
different NUMA shared memory access mechanisms, and the
software techniques provide a mechanism to integrate and
optimize NUMA computation within the standard parallel
random access machine (PRAM) operation of the CESM. The
hardware techniques are evaluated on our REPLICA CESM
architecture and compared to an ideal CESM machine making
use of the proposed software techniques.
Original language | English |
---|---|
Pages (from-to) | 189-206 |
Journal | International Journal of Networking and Computing |
Volume | 4 |
Issue number | 1 |
Publication status | Published - 2014 |
MoE publication type | A1 Journal article-refereed |
Keywords
- parallel computing
- models of computation
- programming model
- shared memory emulation
- NUMA
- PRAM