M.Sc Thesis

M.Sc StudentSzarfman Dafna
SubjectReducing Cache Conflicts via Garbage Collection
DepartmentDepartment of Computer Science
Supervisor PROF. Erez Petrank


In this thesis we study ways to reduce cache conflict of programs using garbage collection. With the rapid improvement of processor speed, performance of the memory hierarchy has become the principal bottleneck for most applications.
A Cache is a small, fast memory located close to the CPU that holds the most recently accessed code or data. Caches are divided into a number of Blocks. If the CPU finds a requested memory block that is held in the cache, it is called a cache hit. Any memory reference to an address not held in a block in the cache causes that block to be fetched from main memory. This event is called Cache Miss and the processor may have to be stalled for several clocks cycles until the block is retrieved from main memory. If the block is not in main memory, the processor will be stalled for thousands of clock cycles until the block is retrieved from the hard disk into memory and then into the cache. Cache performance is an important part of the total performance in modern computer system. The cost of a cache miss has grown as the speeds of modern processors and the complexity of their designs have increased.
Garbage collection is an effective technique for automatic storage reclamation and a widely used commercial language implementations. Garbage collection can also dramatically simplify programs, chiefly by allowing modules to present cleaner interfaces to each other: the management of object storage between modules is unnecessary. Most modern object-oriented languages such as Smalltalk, Eiffel C and Java are supported by garbage collection.
Cache performance can be improved by using the garbage collector to move objects so that most working data is kept in the cache.
In this work we will enhance an algorithm proposed by Chililmbi and Larus, which showed how to improve the memory performance of pointer-based programs by rearranging data at runtime.  Their algorithm detects objects that are used often one after the other and places them near each other in memory so they are likely to reside in the same memory block. Due to the new location of the object in the main memory, there will be a reduction in the cache miss rate: even if the first object in the block causes a miss, the second comes for free.
Our algorithm will still obtain the same benefits but in addition, it will also refrain from locating temporal affine objects at memory blocks that are mapped to the same cache block (and collide). Using designated benchmarks, we demonstrate the benefit of resolving cache conflicts and changing the mapping of program's objects to have as few as possible cache misses. Such mapping can significantly improve the cache behavior, thus, improving the performance of the program.