Technion - Israel Institute of Technology, Graduate School
M.Sc. Thesis
M.Sc. Student: Ben-Itzhak Yaniv
Subject: Performance and Power Aware Thread Allocation for NoC CMP
Department: Department of Electrical Engineering
Supervisors: Professor Emeritus Israel Cidon, Professor Emeritus Avinoam Kolodny


Abstract

We developed an efficient, performance- and power-aware thread allocation heuristic for a coarse-grain multi-threaded Chip Multi-Processor (CMP) with a shared memory architecture, whose cores are connected by a mesh Network-on-Chip (NoC).

There is a clear trade-off between performance and power, and the allocation of threads affects both the performance and the power consumption of the CMP system. To maximize performance, one would keep the load on each core light. In that case the threads are spread thinly over many cores, which in turn increases power consumption. Conversely, running all threads on a single core would minimize power consumption, since all other cores can be shut down, but performance would suffer greatly.

This trade-off, together with the fact that power has become a critical constraint, calls for new solutions that minimize power consumption while maximizing performance.

The research addresses this trade-off using analytical models for both performance and power. Based on these models, we define a parameterized performance/power metric that is adjusted according to the preferred trade-off between performance and power. This preferred trade-off may change dynamically as operational and battery conditions vary.
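As a rough illustration of how a single parameter can shift the preference between performance and power, the sketch below uses a hypothetical metric of the form performance^alpha / power^(1 - alpha). This functional form, the names `tradeoff_metric`, `candidates`, and `preferred`, and the numbers are assumptions made for this example only; they are not the analytical models or the metric defined in the thesis.

```python
def tradeoff_metric(performance: float, power: float, alpha: float) -> float:
    """Hypothetical metric (assumption, not the thesis's definition).

    alpha = 1 rewards performance only; alpha = 0 rewards low power only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return performance ** alpha / power ** (1.0 - alpha)


# Two made-up candidate allocations: (performance, power) in arbitrary units.
candidates = {
    "spread over 8 cores": (8.0, 8.0),
    "packed on 2 cores": (3.2, 2.0),
}


def preferred(alpha: float) -> str:
    """Return the candidate with the highest metric value for this alpha."""
    return max(candidates, key=lambda name: tradeoff_metric(*candidates[name], alpha))


print(preferred(0.9))  # performance-leaning setting
print(preferred(0.1))  # power-leaning setting
```

With these made-up numbers, a performance-leaning `alpha` favors the spread allocation and a power-leaning `alpha` favors the packed one, which mirrors how a change in battery conditions could change the preferred allocation.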

To that end, we introduce an Iterative Threshold Algorithm (ITA) for allocating threads to cores in the case of a single application with symmetric threads. We extend this to a simple and efficient heuristic for the case of multiple applications. ITA utilizes the CMP resources in a way that maximizes a given parameterized performance/power metric.
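To make the optimization target concrete for the single-application case, the following sketch exhaustively tries every number of active cores for a set of symmetric threads and keeps the one with the best metric value. It is a brute-force baseline for illustration, not the ITA itself, whose details are not given in this abstract. The saturating throughput model, the one-unit-per-core power model, the metric form, and the name `best_core_count` are all assumptions.

```python
def best_core_count(threads: int, cores: int, alpha: float) -> int:
    """Exhaustively pick the number of active cores that maximizes a toy metric.

    Assumptions (not from the thesis): symmetric threads are split evenly,
    a core's throughput saturates as its load grows, each active core
    draws one unit of power, and idle cores are shut down.
    """
    best_k, best_score = 1, float("-inf")
    for k in range(1, min(threads, cores) + 1):
        load = threads / k  # threads per active core (at least 1)
        # Toy saturating throughput: k cores, each slowed by its own load.
        performance = k * load / (1.0 + 0.5 * (load - 1.0))
        power = float(k)  # one power unit per active core
        score = performance ** alpha / power ** (1.0 - alpha)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# 16 symmetric threads on an 8-core CMP under three different preferences.
for alpha in (0.0, 0.55, 1.0):
    print(alpha, best_core_count(threads=16, cores=8, alpha=alpha))
```

Under these toy models, `alpha = 1.0` activates every available core and `alpha = 0.0` packs all threads onto one core, matching the two extremes described in the abstract; intermediate values select a core count in between.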

We compare the performance/power metric value achieved by the ITA with the results of several standard optimization methods. The algorithm outperforms the best of these methods while requiring on average 0.01%, and at most 2.5%, of their computational effort.