Performance Simulation of Multi-processor Systems based on Load Reallocation: Master's thesis

Marko Jaakola

Research output: Thesis › Master's thesis › Theses

Abstract

This work presents a novel method for high-level performance estimation of systems consisting of multiple computational units. The goal is to support system designers in the early phases of the system design flow. The focus is mainly on embedded systems, and this first part of the work starts from those that perform parallel processing with mutually similar execution units. Systems consisting of different types of processors, and extensions of the method to support them, are also discussed. The main idea is to reallocate a single processor's load to multiple simulated processors. The method uses measurements from actual, existing systems and relies on simulation of the systems under design. Rather than competing with prototyping, the method is intended to estimate which kind of system architecture would fulfil the desired performance requirements. The measurement data is processed automatically, which results in a so-called workload model. The workload model is then executed on a simulated system, and this simulation run approximates the performance of the proposed system. Because the modelling phase is automated and the level of abstraction is high, the method allows fast approximation of several different configurations. The first problem area was to define which type of workload model is suitable and how it can be created. When the workload is measured from a uni-processor system, the parts of it that can be executed in parallel must be discovered in order to use the model with a multi-processor system. The second problem area is the modelling of the performance-related parts of the system under design. The larger problem is to study the validity and rationality of the whole method. We validated the method with two different test cases, and both gave reasonable results.
The first validation case is a simple threaded application that uses an inter-thread synchronization mechanism. As the internal functionality of the application is known, the characteristics of the method can be roughly seen. The second validation case is a real-world algorithm, which we executed on both a simulated and an existing two-processor system. The margin of error of the method can be calculated from the latter validation case by comparing the total execution times of the two systems. For this case the margin of error was 10 to 15 %, which was better than expected for a method with a rather high level of abstraction. As research results, the work presents the parts needed for the method: instrumentation for gathering the measurement data, the creation of a workload model from that data, a simulation of a multi-processor system with the workload model, and visualization of the simulation results. In addition, an analysis of these parts and of the whole method is presented.
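The load-reallocation idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a workload model made of tasks with measured durations and explicit dependencies, and it substitutes a plain greedy list scheduler for the simulated system. The names `Task`, `simulate`, `relative_error`, and `WORKLOAD`, and all task durations, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One unit of measured work (hypothetical workload-model entry)."""
    name: str
    duration: float        # execution time measured on the uni-processor
    deps: tuple = ()       # names of tasks that must finish before this one


def simulate(tasks, n_processors):
    """Greedy list scheduling; returns the simulated total execution time."""
    finish = {}                       # task name -> simulated finish time
    free_at = [0.0] * n_processors    # time at which each processor is idle

    def earliest_start(task):
        # A task cannot start before all of its dependencies have finished.
        return max((finish[d] for d in task.deps), default=0.0)

    pending = list(tasks)
    while pending:
        ready = [t for t in pending if all(d in finish for d in t.deps)]
        if not ready:
            raise ValueError("cyclic or unresolved dependency")
        task = min(ready, key=earliest_start)
        cpu = min(range(n_processors), key=free_at.__getitem__)
        start = max(free_at[cpu], earliest_start(task))
        finish[task.name] = start + task.duration
        free_at[cpu] = finish[task.name]
        pending.remove(task)
    return max(finish.values(), default=0.0)


def relative_error(simulated, measured):
    """Margin of error from comparing two total execution times."""
    return abs(simulated - measured) / measured


# Hypothetical workload: 'a' and 'b' are the parts found to be parallelizable.
WORKLOAD = [
    Task("load", 2.0),
    Task("a", 3.0, deps=("load",)),
    Task("b", 3.0, deps=("load",)),
    Task("merge", 1.0, deps=("a", "b")),
]

uni_time = simulate(WORKLOAD, 1)    # 9.0: everything runs sequentially
dual_time = simulate(WORKLOAD, 2)   # 6.0: 'a' and 'b' overlap
```

With an invented measurement of 6.8 time units on a real two-processor system, `relative_error(dual_time, 6.8)` evaluates to roughly 0.12. That figure only illustrates the comparison of total execution times and is not data from the thesis.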
Original language: English
Qualification: Master Degree
Awarding Institution
  • University of Oulu
Supervisors/Advisors
  • Seppänen, Tapio, Supervisor, External person
  • Silvén, Olli, Supervisor, External person
Place of Publication: Espoo
Publisher: VTT Technical Research Centre of Finland
Electronic ISBNs: 978-951-38-7358-5
Publication status: Published - 2009
MoE publication type: G2 Master's thesis, polytechnic Master's thesis

Fingerprint

Embedded systems
Synchronization
Automation
Visualization
Systems analysis
Processing

Keywords

  • parallelism
  • workload modelling

Cite this

Jaakola, M. (2009). Performance Simulation of Multi-processor Systems based on Load Reallocation: Master's thesis. Espoo: VTT Technical Research Centre of Finland.
Jaakola, Marko. / Performance Simulation of Multi-processor Systems based on Load Reallocation : Master's thesis. Espoo : VTT Technical Research Centre of Finland, 2009. 65 p.
@mastersthesis{8290b3296f854f43b4370e86da0eb4ce,
title = "Performance Simulation of Multi-processor Systems based on Load Reallocation: Master's thesis",
keywords = "parallelism, workload modelling",
author = "Marko Jaakola",
year = "2009",
language = "English",
series = "VTT Publications",
publisher = "VTT Technical Research Centre of Finland",
number = "717",
address = "Finland",
school = "University of Oulu",

}

Jaakola, M 2009, 'Performance Simulation of Multi-processor Systems based on Load Reallocation: Master's thesis', Master Degree, University of Oulu, Espoo.

Performance Simulation of Multi-processor Systems based on Load Reallocation : Master's thesis. / Jaakola, Marko.

Espoo : VTT Technical Research Centre of Finland, 2009. 65 p.

TY - THES

T1 - Performance Simulation of Multi-processor Systems based on Load Reallocation

T2 - Master's thesis

AU - Jaakola, Marko

PY - 2009

Y1 - 2009

KW - parallelism

KW - workload modelling

M3 - Master's thesis

T3 - VTT Publications

PB - VTT Technical Research Centre of Finland

CY - Espoo

ER -

Jaakola M. Performance Simulation of Multi-processor Systems based on Load Reallocation: Master's thesis. Espoo: VTT Technical Research Centre of Finland, 2009. 65 p.