Abstract
It is possible to implement the parallel random access machine (PRAM) on
a chip multiprocessor (CMP) efficiently with an emulated shared memory
(ESM) architecture to gain the easy parallel programmability crucial to
wider penetration of CMPs into general-purpose computing. This
implementation relies on exploiting the slack of parallel
applications, rather than caches, to hide the latency of the memory system,
sufficient bisection bandwidth to guarantee high throughput, and
hashing to avoid hot spots in intercommunication. Unfortunately, this
solution cannot handle workloads with low thread-level parallelism
(TLP) efficiently because there is then not enough parallel slackness
available for hiding the latency. In this paper we show that integrating
non-uniform memory access (NUMA) support into the PRAM implementation
architecture can solve this problem and provide a natural way to
migrate legacy code written for a sequential or multi-core NUMA
machine. The resulting PRAM-NUMA hybrid model is defined, and
its architectural implementation is outlined on our ECLIPSE ESM CMP
framework. A high-level programming language example is given.
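
The dependence on parallel slackness described above can be illustrated with a toy model. The following Python sketch is not taken from the paper or from the ECLIPSE framework. It assumes a single processor that interleaves its threads round-robin, issues at most one memory reference per cycle, and waits a fixed number of cycles for each reference to return. The function name `simulate_utilization` and its parameters are hypothetical.

```python
def simulate_utilization(threads: int, latency: int, cycles: int = 1600) -> float:
    """Fraction of cycles in which the processor issues a memory reference.

    Toy model (an assumption, not the ECLIPSE design): each thread issues
    one reference and is blocked for `latency` cycles until it returns.
    """
    if threads <= 0 or latency <= 0 or cycles <= 0:
        raise ValueError("threads, latency and cycles must be positive")

    ready_at = [0] * threads   # cycle at which each thread may issue again
    next_thread = 0            # round-robin pointer
    issued = 0

    for cycle in range(cycles):
        # Scan the threads in round-robin order for one that is ready.
        for offset in range(threads):
            t = (next_thread + offset) % threads
            if ready_at[t] <= cycle:
                ready_at[t] = cycle + latency  # blocked until the reference returns
                issued += 1
                next_thread = (t + 1) % threads
                break
        # If no thread was ready, the cycle is lost to memory latency.

    return issued / cycles


if __name__ == "__main__":
    for n in (1, 4, 16, 32):
        print(n, simulate_utilization(threads=n, latency=16))
```

Under these assumptions, utilization saturates once the number of threads reaches the memory latency and falls proportionally when TLP is low, which is the case the NUMA support is meant to address.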
Original language | English |
---|---|
Pages (from-to) | 21-35 |
Number of pages | 15 |
Journal | International Journal of Networking and Computing |
Volume | 1 |
Issue number | 1 |
Publication status | Published - 2011 |
MoE publication type | A1 Journal article-refereed |
Event | 12th Workshop on Advances in Parallel and Distributed Computational Models (APDCM) - Atlanta, United States. Duration: 19 Apr 2010 → 23 Apr 2010 |
Keywords
- Parallel computing
- Computational models
- Thread-level parallelism
- PRAM
- NUMA