M.Sc Thesis

M.Sc Student: Yuval Gad
Subject: Architectural Considerations for DSP Processors Supporting Intense Real-Time DSP Applications
Department: Department of Electrical and Computers Engineering
Supervisor: Prof. Avi Mendelson


Emerging standards for wireless communication and multimedia involve complex Digital Signal Processing (DSP) algorithms that require support for both typical DSP kernels and control-oriented tasks. Modern DSP processor architectures must optimize their performance as well as their power consumption to meet the changing needs of the software environment driven by these very demanding standards.

To evaluate the performance and power consumption of future DSP architectures, three architectural approaches are compared: VLIW (Very Long Instruction Word), Vector, and a Hybrid approach. Each of these architectures can take advantage of the inherent Instruction-Level Parallelism (ILP) as well as the Data-Level Parallelism (DLP) that typically exist in DSP applications. However, while typical DSP code and numerical algorithms exhibit both ILP and DLP, control-oriented tasks have relatively little ILP and almost no DLP.

Unlike most other works, which use "kernel-based" benchmarks for evaluation, this work chooses two software frameworks that represent future workloads: the LTE (Long Term Evolution) physical layer, which is used in wireless communication, and the H.264 CODEC (enCOder DECoder), which is used for video compression in multimedia. The performance analysis presented in this work is based on a Performance Accurate Simulator (PAC) which executes optimized code for each of the architectures. Area and power consumption estimates are based on a synthesis of an existing DSP processor employing a dedicated power simulator.

The first part of the thesis examines the impact of future software environments on a single-threaded architecture and indicates that a newly optimized DSP processor design aiming at maximum performance-to-power ratio requires a fine balance between flexibility and vectorization capabilities. The second part of the research focuses on extending the DSP architecture into a multithreaded (MT) DSP architecture. This part extends the VLIW processor architecture with multithreading, a known technique for increasing the processor's ILP by converting Thread-Level Parallelism (TLP) into ILP. It concludes that a Simultaneous Multithreading (SMT) architecture provides significant performance improvement, but also increases the power consumption.

For all the architectures presented in this work, the research concludes that the performance-to-power ratio depends mostly on the static power consumption of the design. Designs with low static power consumption are not expected to gain much in performance-to-power ratio. We therefore show the importance of also considering the power consumption of the processor's caches.
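
The dependence on static power can be illustrated with a minimal sketch. This is not the thesis's power model: it assumes that total power is simply static plus dynamic power, that an architectural change speeds execution up by a factor `speedup` and scales dynamic power by `dynamic_scale`, and that static power stays unchanged. The function names and all numbers below are hypothetical and chosen only for illustration.

```python
def perf_per_watt(throughput, static_w, dynamic_w):
    """Performance-to-power ratio under the assumed model: power = static + dynamic."""
    return throughput / (static_w + dynamic_w)


def ratio_gain(speedup, dynamic_scale, static_w, dynamic_w):
    """Relative change in performance-to-power ratio after an architectural change.

    Assumptions (hypothetical, not taken from the thesis):
      - baseline throughput is normalized to 1.0
      - the change multiplies throughput by `speedup`
      - the change multiplies dynamic power by `dynamic_scale`
      - static power is unaffected by the change
    """
    base = perf_per_watt(1.0, static_w, dynamic_w)
    new = perf_per_watt(speedup, static_w, dynamic_w * dynamic_scale)
    return new / base


if __name__ == "__main__":
    # Same hypothetical change (1.5x faster, 1.5x dynamic power) on two designs
    # that both draw 1.0 W in total but split it differently.
    high_static = ratio_gain(1.5, 1.5, static_w=0.8, dynamic_w=0.2)
    low_static = ratio_gain(1.5, 1.5, static_w=0.1, dynamic_w=0.9)
    print(f"static-dominated design: ratio gain x{high_static:.2f}")
    print(f"low-static design:       ratio gain x{low_static:.2f}")
```

Under these assumptions, the design dominated by static power gains noticeably more in performance-to-power ratio than the design with low static power, because the extra performance is amortized over a power budget that barely grows. A real evaluation would also need to account for any static power added by the extra hardware and by the caches.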