Implementation of multioperations in thick control flow processors

Martti Forsell, Jussi Roivainen, Ville Leppanen, Jesper Larsson Traff

Research output: Chapter in Book/Report/Conference proceeding > Conference article in proceedings > Scientific > peer-review

Abstract

Multioperations are primitives of parallel computation in which processors reduce values provided by multiple threads into a single value, e.g. by addition, in a constant number of steps. Algorithmically, multioperations can speed up execution by a logarithmic factor over their single-operation counterparts. In this paper, we propose an architectural technique for realizing multioperations in thick control flow processors. Thick control flows (TCFs) are computational constructs that simplify parallel programming by bundling a number of homogeneous threads following the same control path into universalized vector-like entities. The elements of TCFs are called fibers to distinguish them from ordinary threads, which have their own individual control. Processors designed for executing TCFs feature a unique frontend-backend structure that provides low-latency processing of TCF-common computations and high-throughput execution of data-parallel fibers. Our proposal relies on step caches and equally sized multioperation scratchpads, while on the memory side we make use of active memory modules. The idea is to compute partial results in the backend units to reduce the traffic to the referenced shared memory location. The final result is then computed in the active memory unit of the target memory module. According to the evaluation made with our TCF-aware processor equipped with multioperation scratchpads and active memory units, it indeed executes certain algorithms on N data elements log N times faster than the baseline processor. The cost of the implementation is preliminarily evaluated.
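
The two-stage idea in the abstract (partial results in the backend units, final combination at the target memory module) can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the hardware mechanism from the paper: the function name multi_add, the modulo mapping of fibers to backend units, and the use of plain dictionaries for scratchpads and shared memory are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def multi_add(memory, requests, num_backend_units):
    """Simulate one multioperation (addition) step in software.

    memory: dict of address -> value, standing in for shared memory.
    requests: list of (fiber_id, address, value) contributions.
    num_backend_units: assumed number of backend units (hypothetical).
    Returns the number of messages that reach the memory side.
    """
    # Stage 1: each backend unit combines the contributions of its own
    # fibers per target address in a local scratchpad (a dict here).
    scratchpads = [defaultdict(int) for _ in range(num_backend_units)]
    for fiber_id, address, value in requests:
        unit = fiber_id % num_backend_units  # assumed fiber-to-unit mapping
        scratchpads[unit][address] += value

    # Stage 2: one partial result per (unit, address) pair is sent to
    # memory, where it is folded into the target location. This loop
    # plays the role of the active memory unit.
    messages = 0
    for pad in scratchpads:
        for address, partial in pad.items():
            memory[address] = memory.get(address, 0) + partial
            messages += 1
    return messages


# 16 fibers each add a value (1..16) to the same address 0.
memory = {0: 0}
requests = [(fiber, 0, fiber + 1) for fiber in range(16)]
messages = multi_add(memory, requests, num_backend_units=4)
print(memory[0], messages)
```

In this toy run, the 16 contributions to a single address are combined into one partial result per backend unit, so only 4 messages reach the memory side instead of 16. That is the traffic reduction the abstract refers to, shown here purely in software terms.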

Original language: English
Title of host publication: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Publisher: Institute of Electrical and Electronic Engineers IEEE
Pages: 744-752
Number of pages: 9
ISBN (Electronic): 978-1-5386-5555-9
ISBN (Print): 978-1-5386-5556-6
DOIs: https://doi.org/10.1109/IPDPSW.2018.00121
Publication status: Published - 3 Aug 2018
MoE publication type: Not Eligible
Event: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 - Vancouver, Canada
Duration: 21 May 2018 to 25 May 2018

Conference

Conference: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Country: Canada
City: Vancouver
Period: 21/05/18 to 25/05/18

Fingerprint

Flow control
Data storage equipment
Parallel programming
Fibers
Throughput
Processing
Thread
Costs

Keywords

  • Multioperations
  • Parallel computing
  • Processor architecture
  • Reductions
  • TCF

Cite this

Forsell, M., Roivainen, J., Leppanen, V., & Traff, J. L. (2018). Implementation of multioperations in thick control flow processors. In Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 (pp. 744-752). [8425488] Institute of Electrical and Electronic Engineers IEEE. https://doi.org/10.1109/IPDPSW.2018.00121
Forsell, Martti ; Roivainen, Jussi ; Leppanen, Ville ; Traff, Jesper Larsson. / Implementation of multioperations in thick control flow processors. Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018. Institute of Electrical and Electronic Engineers IEEE, 2018. pp. 744-752
@inproceedings{2daaa6f2edcb483b9e49f9e565546748,
title = "Implementation of multioperations in thick control flow processors",
abstract = "Multioperations are primitives of parallel computation for which processors perform a reduction, e.g. addition, on values provided by multiple threads into a single value in a constant number of steps. Algorithmically, multioperations can speed up execution by a logarithmic factor over their single operation counterparts. In this paper, we propose an architectural technique for realizing multioperations in thick control flow processors. Thick control flows (TCF) are computational constructs that simplify parallel programming by bundling a number of homogeneous threads following the same control path into universalized vector-like entities. The elements of TCFs are called fibers to distinguish them from ordinary threads having their own individual control. Processors designed for executing TCFs feature a unique frontend-backend structure to provide low-latency processing of TCF-common computations and high-throughput execution of data parallel fibers. Our proposal relies on step caches and equally sized multioperation scratchpads, while on the memory side, we make use of active memory modules. The idea is to compute partial results in backend units to reduce the traffic to the referred shared memory location. The final result is then computed in the active memory unit of the target memory module. According to the evaluation made with our TCF-aware processor equipped with multioperation scratchpads and active memory units, it indeed executes certain N data element-algorithms log N times faster than the baseline processor. The cost of the implementation is preliminarily evaluated.",
keywords = "Multioperations, Parallel computing, Processor architecture, Reductions, TCF",
author = "Martti Forsell and Jussi Roivainen and Ville Leppanen and Traff, {Jesper Larsson}",
year = "2018",
month = "8",
day = "3",
doi = "10.1109/IPDPSW.2018.00121",
language = "English",
isbn = "978-1-5386-5556-6",
pages = "744--752",
booktitle = "Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018",
publisher = "Institute of Electrical and Electronic Engineers IEEE",
address = "United States",

}

Forsell, M, Roivainen, J, Leppanen, V & Traff, JL 2018, Implementation of multioperations in thick control flow processors. in Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018., 8425488, Institute of Electrical and Electronic Engineers IEEE, pp. 744-752, 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018, Vancouver, Canada, 21/05/18. https://doi.org/10.1109/IPDPSW.2018.00121

Implementation of multioperations in thick control flow processors. / Forsell, Martti; Roivainen, Jussi; Leppanen, Ville; Traff, Jesper Larsson.

Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018. Institute of Electrical and Electronic Engineers IEEE, 2018. p. 744-752 8425488.

TY - GEN

T1 - Implementation of multioperations in thick control flow processors

AU - Forsell, Martti

AU - Roivainen, Jussi

AU - Leppanen, Ville

AU - Traff, Jesper Larsson

PY - 2018/8/3

Y1 - 2018/8/3

AB - Multioperations are primitives of parallel computation for which processors perform a reduction, e.g. addition, on values provided by multiple threads into a single value in a constant number of steps. Algorithmically, multioperations can speed up execution by a logarithmic factor over their single operation counterparts. In this paper, we propose an architectural technique for realizing multioperations in thick control flow processors. Thick control flows (TCF) are computational constructs that simplify parallel programming by bundling a number of homogeneous threads following the same control path into universalized vector-like entities. The elements of TCFs are called fibers to distinguish them from ordinary threads having their own individual control. Processors designed for executing TCFs feature a unique frontend-backend structure to provide low-latency processing of TCF-common computations and high-throughput execution of data parallel fibers. Our proposal relies on step caches and equally sized multioperation scratchpads, while on the memory side, we make use of active memory modules. The idea is to compute partial results in backend units to reduce the traffic to the referred shared memory location. The final result is then computed in the active memory unit of the target memory module. According to the evaluation made with our TCF-aware processor equipped with multioperation scratchpads and active memory units, it indeed executes certain N data element-algorithms log N times faster than the baseline processor. The cost of the implementation is preliminarily evaluated.

KW - Multioperations

KW - Parallel computing

KW - Processor architecture

KW - Reductions

KW - TCF

UR - http://www.scopus.com/inward/record.url?scp=85052218052&partnerID=8YFLogxK

U2 - 10.1109/IPDPSW.2018.00121

DO - 10.1109/IPDPSW.2018.00121

M3 - Conference article in proceedings

SN - 978-1-5386-5556-6

SP - 744

EP - 752

BT - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018

PB - Institute of Electrical and Electronic Engineers IEEE

ER -

Forsell M, Roivainen J, Leppanen V, Traff JL. Implementation of multioperations in thick control flow processors. In Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018. Institute of Electrical and Electronic Engineers IEEE. 2018. p. 744-752. 8425488 https://doi.org/10.1109/IPDPSW.2018.00121