טכניון מכון טכנולוגי לישראל
הטכניון מכון טכנולוגי לישראל - בית הספר ללימודי מוסמכים  
M.Sc Thesis
M.Sc StudentBarash Dror
SubjectCache Manipulations Improve Multimedia Applications
DepartmentDepartment of Electrical Engineering
Supervisors Professor Emeritus Uri Weiser
Professor Emeritus Avinoam Kolodny
Full Thesis text - in Hebrew Full thesis text - Hebrew Version


Abstract

This paper deals with difficulties encountered by processors when running multimedia applications. These difficulties are the result of cache involved operations and of cache inherent properties. The source of these problems resides in the processes of prefetching instruction bytes from the memory into the cache and arranging them, and fetching data from the cache. These processes are inherent to the processor structure, which is built for uni-dimensional locality as used by the majority of applications, while multimedia applications require, in fact, a bi-dimensional locality. The uni-dimensional locality means that data is arranged in lines and stored in consecutive addresses, thus complying with the traditional processing techniques.

The multimedia data characteristics must be considered when processing it, taking into account its bi-dimensional nature.

One difficulty created by the use of a bi-dimensional cache is the problem of the required line length. The distance value is derived from the image size and quality, and though this information is not available to the processor, it can be retrieved from the program data and be made available to the processor.


If the cache is relatively small, the prefetched data may be erased before it has even been used. If the cache is large, a cache miss occurs when executing operations to all the data items of the first column, following by a cache miss-free execution to all other operations. In all cases, all the required data must be fetched and stored in the registers before any required calculation can be made.

Cache multiple access operations may be avoided by using a prefetch algorithm that would fetch bi-dimensional memory portions with all the required information. Using such algorithm would result in not only the fetch of the rest of the information row, but the fetch of the entire point information vicinity.

Another difficulty dealt with in this paper, is a deviation from the uni-dimensional principle made by a number of algorithms that are used by multimedia applications and require data items to be arranged in columns. Inspecting this difficulty, we examined an innovative solution, principally prefetching and returning cache columns, as opposed to the classical row fetching process, which is in common use. A sub pixel algorithm code was run on a simple scalar processor model together with the two previously mentioned solutions. The results demonstrated evident improvements in addition to up to six times decrease in the cache misses.