Technion - Israel Institute of Technology
The Technion - Israel Institute of Technology - Graduate School
M.Sc Thesis
M.Sc Student: Gendler Alexander
Subject: A Multi-Prefetcher Mechanism Based on a Prefetcher Assessment Buffer (PAB)
Department: Department of Electrical Engineering
Supervisors: Professor Avi Mendelson
Professor Yitzhak Birk


Abstract

Prefetching mechanisms have proven very effective in improving performance for “memory-bound” applications. As the gap between processor speed and memory access time widens, more sophisticated and more aggressive prefetch algorithms are being developed, and modern processors integrate several types of prefetchers into their micro-architecture. Although sophisticated prefetching mechanisms yield significant performance gains for some important applications, they substantially increase bus traffic and the “pressure” on the cache tag arrays. In fact, these aggressive prefetching mechanisms may even reduce performance for applications that are not memory-bound. This thesis introduces a new “feedback” mechanism, termed the Prefetcher Assessment Buffer (PAB), that aims to filter out prefetch requests that are unlikely to be useful. As a result, applications that benefit from prefetching get more useful work out of the prefetchers, while applications that cannot benefit from these aggressive mechanisms do not suffer from their side effects.

The thesis focuses on using the PAB to improve performance relative to the simultaneous use of all the prefetchers, while significantly reducing the number of memory accesses, bus traffic and tag probes. It presents the use of a PAB in different configurations, such as “all L1 cache accesses trigger prefetching” and “only L1 misses trigger prefetching”. Applying the proposed techniques to prefetching from main memory to the L2 cache can reduce the number of loads from main memory by up to 25% without losing performance. Applying more sophisticated techniques to prefetching between the L2 and L1 caches can increase IPC by 4% while reducing the traffic between the caches 8-fold.
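
The abstract does not spell out how the PAB works internally, so the following is only an illustrative sketch of the general feedback idea it describes: record what each prefetcher would have fetched, check whether later demand accesses actually touch those addresses, and let a prefetcher issue real requests only while its measured accuracy is high enough. The class name `AssessmentBuffer`, its parameters (`capacity`, `threshold`, `window`), the FIFO replacement and the two toy prefetchers are all assumptions made for this example and are not taken from the thesis.

```python
from collections import OrderedDict


class AssessmentBuffer:
    """Toy feedback filter (hypothetical, not the thesis design)."""

    def __init__(self, capacity=8, threshold=0.5, window=4):
        self.capacity = capacity      # max predictions tracked at once (assumed)
        self.threshold = threshold    # min accuracy needed to issue (assumed)
        self.window = window          # predictions required before judging (assumed)
        self.pending = OrderedDict()  # (prefetcher, address) keys in FIFO order
        self.predicted = {}           # prefetcher -> predictions recorded
        self.useful = {}              # prefetcher -> predictions later demanded

    def record_prediction(self, prefetcher, address):
        """Remember a predicted address without fetching it."""
        key = (prefetcher, address)
        if key in self.pending:
            return
        if len(self.pending) >= self.capacity:
            self.pending.popitem(last=False)  # evict the oldest prediction
        self.pending[key] = None
        self.predicted[prefetcher] = self.predicted.get(prefetcher, 0) + 1

    def demand_access(self, address):
        """Credit every prefetcher that had predicted this address."""
        for key in [k for k in self.pending if k[1] == address]:
            del self.pending[key]
            self.useful[key[0]] = self.useful.get(key[0], 0) + 1

    def accuracy(self, prefetcher):
        n = self.predicted.get(prefetcher, 0)
        return self.useful.get(prefetcher, 0) / n if n else 0.0

    def should_issue(self, prefetcher):
        """Allow real prefetch requests only for prefetchers judged accurate."""
        if self.predicted.get(prefetcher, 0) < self.window:
            return False
        return self.accuracy(prefetcher) >= self.threshold


# Sequential demand stream 0..7 observed by two toy prefetchers.
pab = AssessmentBuffer()
for addr in range(8):
    pab.demand_access(addr)
    pab.record_prediction("next_line", addr + 1)   # predicts the next address
    pab.record_prediction("stride_16", addr + 16)  # predicts 16 addresses ahead

print(pab.should_issue("next_line"), pab.should_issue("stride_16"))
```

On this sequential stream the next-line predictions are confirmed by later accesses, so that prefetcher is allowed to issue requests, while the stride-16 predictions are never touched and are filtered out. A real design would have to choose the buffer size, the accuracy threshold and the replacement policy with hardware cost in mind; the values used here are placeholders.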