A quantitative comparison of PRAM based emulated shared memory architectures to current multicore CPUs and GPUs

E. Hansson, E. Alnervik, C Kessler, Martti Forsell

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

Abstract

The performance of current multicore CPUs and GPUs is limited in computations that make frequent use of communication/synchronization between the subtasks executed in parallel. This is because the directory-based cache systems scale weakly and/or the cost of synchronization is high. The Emulated Shared Memory (ESM) architectures, which rely on multithreading and efficient synchronization mechanisms, have been developed to solve these problems, which affect both the performance and the programmability of current machines. In this paper, we present a preliminary comparison of the performance of three hardware-implemented ESM architectures with state-of-the-art multicore CPUs and GPUs. The benchmarks are selected to cover different patterns of parallel computation and therefore reveal the performance potential of ESM architectures with respect to current multicores.
Original language: English
Title of host publication: ARCS 2014 Workshop Proceedings
Number of pages: 7
Publication status: Published - 2014
MoE publication type: A4 Article in a conference publication
Event: 27th International Conference on Architecture of Computing Systems, ARCS 2014 - Lübeck, Germany
Duration: 25 Feb 2014 - 28 Feb 2014

Conference

Conference: 27th International Conference on Architecture of Computing Systems, ARCS 2014
Abbreviated title: ARCS 2014
Country: Germany
City: Lübeck
Period: 25/02/14 - 28/02/14

Fingerprint

Memory architecture
Program processors
Synchronization
Computer hardware
Communication
Graphics processing unit
Costs

Cite this

@inproceedings{864b3c4f6c46468bb13cf63a1bbbbb1d,
title = "A quantitative comparison of PRAM based emulated shared memory architectures to current multicore CPUs and GPUs",
abstract = "The performance of current multicore CPUs and GPUs is limited in computations making frequent use of communication/synchronization between the subtasks executed in parallel. This is because the directory-based cache systems scale weakly and/or the cost of synchronization is high. The Emulated Shared Memory (ESM) architectures relying on multithreading and efficient synchronization mechanism have been developed to solve these problems affecting both performance and programmability of current machines. In this paper, we compare preliminarily the performance of three hardware implemented ESM architectures with state-of-the-art multicore CPUs and GPUs. The benchmarks are selected to cover different patterns of parallel computation and therefore reveal the performance potential of ESM architectures with respect to current multicores.",
author = "E. Hansson and E. Alnervik and C Kessler and Martti Forsell",
year = "2014",
language = "English",
isbn = "978-3-8007-3579-2",
booktitle = "ARCS 2014 Workshop Proceedings",

}

Hansson, E, Alnervik, E, Kessler, C & Forsell, M 2014, A quantitative comparison of PRAM based emulated shared memory architectures to current multicore CPUs and GPUs. in ARCS 2014 Workshop Proceedings. 27th International Conference on Architecture of Computing Systems, ARCS 2014, Lübeck, Germany, 25/02/14.


TY - GEN

T1 - A quantitative comparison of PRAM based emulated shared memory architectures to current multicore CPUs and GPUs

AU - Hansson, E.

AU - Alnervik, E.

AU - Kessler, C

AU - Forsell, Martti

PY - 2014

Y1 - 2014

N2 - The performance of current multicore CPUs and GPUs is limited in computations making frequent use of communication/synchronization between the subtasks executed in parallel. This is because the directory-based cache systems scale weakly and/or the cost of synchronization is high. The Emulated Shared Memory (ESM) architectures relying on multithreading and efficient synchronization mechanism have been developed to solve these problems affecting both performance and programmability of current machines. In this paper, we compare preliminarily the performance of three hardware implemented ESM architectures with state-of-the-art multicore CPUs and GPUs. The benchmarks are selected to cover different patterns of parallel computation and therefore reveal the performance potential of ESM architectures with respect to current multicores.

AB - The performance of current multicore CPUs and GPUs is limited in computations making frequent use of communication/synchronization between the subtasks executed in parallel. This is because the directory-based cache systems scale weakly and/or the cost of synchronization is high. The Emulated Shared Memory (ESM) architectures relying on multithreading and efficient synchronization mechanism have been developed to solve these problems affecting both performance and programmability of current machines. In this paper, we compare preliminarily the performance of three hardware implemented ESM architectures with state-of-the-art multicore CPUs and GPUs. The benchmarks are selected to cover different patterns of parallel computation and therefore reveal the performance potential of ESM architectures with respect to current multicores.

M3 - Conference article in proceedings

SN - 978-3-8007-3579-2

BT - ARCS 2014 Workshop Proceedings

ER -