SW/HW approach for optimizing the performance of synchronous shared memory architectures to low-TLP situations

    Research output: Contribution to conference › Conference article › Scientific

    Abstract

    Synchronous shared memory (SSM) architectures are promising candidates for future CMP architectures due to their ability to execute general-purpose parallel code efficiently down to the finest granularity and their support for easy-to-use parallel programming models. While recent SSM architectures are tuned for fast execution of parallel workloads and co-exploitation of ILP and TLP, the solutions used in them do not support efficient execution of low-TLP code fragments. More generally, this inability of architectures optimized for parallel execution to efficiently execute sequential code has been shown to be one of the design bottlenecks in the theory of architectures. In this presentation we propose a SW/HW approach for dynamically optimizing the performance of recent SSM architectures to low-TLP situations. The HW part includes changes to the processor pipeline and instruction set as well as a new technique called bunching, which combines the execution slots of multiple threads into a single bunch executing a single thread with a speedup proportional to the number of threads. The SW part includes a language-level mechanism to support seamless bunching concurrently with parallel execution. A preliminary evaluation of the approach is given.
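The proportional-speedup claim for bunching can be illustrated with a small idealized model. The sketch below is not taken from the paper: it assumes a processor that hands out a fixed number of thread slots per step, one instruction per slot, and that a bunched thread can fill every slot of its bunch in each step (no dependency or memory stalls). The names `steps_to_finish`, `ideal_speedup`, `slots_per_step`, and `bunch_size` are hypothetical and chosen only for this illustration.

```python
def steps_to_finish(instructions: int, slots_per_step: int, bunch_size: int = 1) -> int:
    """Steps needed for ONE thread to execute `instructions` sequential instructions.

    Assumption (toy model, not the paper's design): a thread owning a bunch
    of `bunch_size` slots executes `bunch_size` instructions per step.
    """
    if instructions < 1:
        raise ValueError("instructions must be at least 1")
    if not 1 <= bunch_size <= slots_per_step:
        raise ValueError("bunch_size must be between 1 and slots_per_step")
    # Integer ceiling division: the last step may be only partially filled.
    return -(-instructions // bunch_size)


def ideal_speedup(instructions: int, slots_per_step: int, bunch_size: int) -> float:
    """Speedup of a bunched thread over the same thread using a single slot."""
    unbunched = steps_to_finish(instructions, slots_per_step, 1)
    bunched = steps_to_finish(instructions, slots_per_step, bunch_size)
    return unbunched / bunched
```

Under these assumptions, `ideal_speedup(1000, 16, 4)` evaluates to `4.0`: a thread that takes over four slots finishes a 1000-instruction sequential fragment in 250 steps instead of 1000. Real hardware would fall short of this bound whenever the bunched thread cannot keep all of its slots busy.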
    Original language: English
    Publication status: Published - 2009
    MoE publication type: Not Eligible
    Event: Scalable Approaches to High-Performance and High-Productivity Computing 2009, ScalPerf’09 - Bertinoro, Italy
    Duration: 20 Sep 2009 to 24 Sep 2009

    Conference

    Conference: Scalable Approaches to High-Performance and High-Productivity Computing 2009, ScalPerf’09
    Abbreviated title: ScalPerf’09
    Country: Italy
    City: Bertinoro
    Period: 20/09/09 to 24/09/09

    Keywords

    • Parallel computing
    • computer architecture
    • CMP
    • PRAM
    • NUMA
    • optimization

    Cite this

    Forsell, M. (2009). SW/HW approach for optimizing the performance of synchronous shared memory architectures to low-TLP situations. Paper presented at Scalable Approaches to High-Performance and High-Productivity Computing 2009, ScalPerf’09, Bertinoro, Italy.