Prototyping the MBTAC Processor for the REPLICA CMP

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review

3 Citations (Scopus)

Abstract

Current chip multiprocessors (CMPs) have mostly been designed by replicating sequential, single-core processors and providing some support for operating them with a shared memory. As a result, they define an asynchronous computational model of threads, often require maximizing the locality of memory references to get decent performance, and feature high intercommunication overheads, which make parallel programming tedious for general-purpose functionalities. Most of these problems can be eliminated by designing the processor architecture for scalable general-purpose computing from the very beginning, as is done in processors for configurable emulated shared memory (CESM) CMPs. They provide support for machine instruction-level synchronization, make use of multithreading to support latency-insensitive computation, and promote the concept of a uniform synchronous shared memory for easy variable allocation and convenient data exchange. In our earlier work we proposed the first CESM architecture, TOTAL ECLIPSE, composed of early MBTAC processors that make use of very low-overhead multithreading, a parallel-computing-savvy functional unit organization, support for fast synchronization between instructions and threads, and highly efficient multioperations. Unfortunately, certain key parts of these processors turned out to be hard to implement, and overall they lacked support for ordered multiprefix operations and full configurability of the CESM scheme. In this paper we introduce a new, fully configurable version of the MBTAC processor for our new REPLICA CESM architecture and the first FPGA implementations of it. To evaluate it, we execute short test programs on it and preliminarily compare it against Intel Core i7 and DLX processors. We also describe our FPGA design flow and testing approach.
Original language: English
Title of host publication: Proceedings
Subtitle of host publication: IEEE International Parallel & Distributed Processing Symposium Workshops, IPDPSW 2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 709-716
ISBN (Electronic): 978-1-4799-4116-2
ISBN (Print): 978-1-4799-4117-9
DOI: https://doi.org/10.1109/IPDPSW.2014.82
Publication status: Published - 2014
MoE publication type: A4 Article in a conference publication
Event: 28th IEEE International Parallel & Distributed Processing Symposium Workshops, IPDPSW 2014 - Phoenix, Arizona, United States
Duration: 19 May 2014 - 23 May 2014
Conference number: 28th

Conference

Conference: 28th IEEE International Parallel & Distributed Processing Symposium Workshops, IPDPSW 2014
Abbreviated title: IPDPSW 2014
Country: United States
City: Phoenix, Arizona
Period: 19/05/14 - 23/05/14

Fingerprint

Data storage equipment
Memory architecture
Field programmable gate arrays (FPGA)
Synchronization
Parallel programming
Electronic data interchange
Parallel processing systems
Testing

Cite this

Forsell, M., Roivainen, J., & Leppänen, V. (2014). Prototyping the MBTAC Processor for the REPLICA CMP. In Proceedings: IEEE International Parallel & Distributed Processing Symposium Workshops, IPDPSW 2014 (pp. 709-716). Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/IPDPSW.2014.82