Ph.D. Thesis

Ph.D. Student: Yadgar Gala
Subject: Multilevel Cache Management Based on Application Hints
Department: Department of Computer Science
Supervisor: Prof. Assaf Schuster


Modern storage systems are designed as standalone platforms, separated from their users and applications by strict protocols. This modularity allows for complex and dynamic system layouts. However, the separation between the storage and the application layers precludes inter-layer information sharing that is crucial for cooperation between the system's components, cooperation that we believe will lead to substantial performance gains.

We propose to share information about the cached disk blocks and use it to make informed caching decisions. This information is currently available only in the application level, but is highly useful for managing the cache in all levels of the hierarchy. We define a new storage model in two stages.

In our initial model, a multilevel cache hierarchy is formed of multiple, non-cooperative clients accessing a single storage server. Multilevel caching introduces new challenges to traditional cache management: data must be kept in the appropriate cache, and replication across the various cache levels must be avoided. However, clients competing for cache buffers can evict each other's blocks and interfere with exclusive caching schemes.
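The exclusivity challenge can be illustrated with a minimal two-level sketch. This is not the thesis's algorithm, only a generic exclusive-caching idiom under assumed names (`LRUCache`, `exclusive_read` are hypothetical): on a second-level hit the server discards the block it serves, and a client demotes its eviction victim back to the server, so a block lives in at most one level.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal fixed-size LRU cache (hypothetical helper)."""
    def __init__(self, size):
        self.size = size
        self.data = OrderedDict()

    def __contains__(self, block):
        if block in self.data:
            self.data.move_to_end(block)   # refresh recency on hit
            return True
        return False

    def insert(self, block):
        """Insert a block; return the evicted LRU victim, if any."""
        if block in self.data:
            self.data.move_to_end(block)
            return None
        self.data[block] = True
        if len(self.data) > self.size:
            return self.data.popitem(last=False)[0]
        return None

    def discard(self, block):
        self.data.pop(block, None)

def exclusive_read(client, server, block):
    """Serve a read while keeping the two cache levels exclusive."""
    if block in client:
        return "client-hit"
    if block in server:
        server.discard(block)      # read-discard: no copy stays below
        result = "server-hit"
    else:
        result = "disk"
    victim = client.insert(block)
    if victim is not None:
        server.insert(victim)      # demote the victim to the server cache
    return result
```

With two competing clients sharing `server`, each client's demotions can evict the other's demoted blocks, which is exactly the interference the global scheme must manage.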

We present a global, non-centralized, informed management policy for a multilevel cache hierarchy. Our algorithm, MC2, combines local, per-client management with a global, system-wide scheme to emphasize the positive effects of sharing and reduce the negative ones. Our local scheme, Karma, uses readily available information about the client's future access profile to save the most valuable blocks, and to choose the best replacement policy for them. The global scheme uses the same information to divide the shared cache space between clients, and to manage this space. Exclusive caching is maintained for non-shared data and is disabled when sharing is identified.
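The two ingredients of such an informed local scheme, space division by hinted value and a per-range replacement policy, can be sketched as follows. The hint format, range names, and policy choices here are illustrative assumptions, not Karma's actual interface:

```python
# Hypothetical application hints: for each block range, its access
# pattern and its share of the accesses.
hints = {
    "index": {"pattern": "random", "frequency": 0.7},
    "table": {"pattern": "loop",   "frequency": 0.3},
}

def partition_cache(total_buffers, hints):
    """Divide cache buffers among ranges in proportion to hinted frequency."""
    return {name: int(total_buffers * h["frequency"])
            for name, h in hints.items()}

def pick_victim(partition_blocks, pattern):
    """Choose a victim within a partition (blocks ordered oldest-first):
    MRU for loops, where the just-used block is furthest from reuse,
    and LRU for random or skewed accesses."""
    return partition_blocks[-1] if pattern == "loop" else partition_blocks[0]
```

The point of the sketch is the separation of concerns: hints drive both how much space each range deserves and which policy manages that space.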

Our second model enables cooperative caching, where clients may access blocks directly from each other's caches. Previous studies treated all cooperating caches as a single pool, maximizing overall system performance at the price of possibly degraded performance for individual clients. In light of the growing scale of data consolidation, we re-evaluate the concept of cooperative caching, considering selfish clients that cooperate only if they benefit from the interaction.
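A selfish client's participation decision reduces to comparing what it saves by reading from peers against what it spends serving them. The following sketch uses invented latency constants and a hypothetical decision rule, purely to make the trade-off concrete:

```python
# Illustrative costs in microseconds (assumed, not measured values):
DISK_US = 5000       # cost of a disk read
PEER_US = 200        # cost of fetching a block from a peer's cache
SERVE_US = 50        # local overhead of serving one peer request

def benefits_from_cooperation(peer_hits, served_requests):
    """A selfish client cooperates only if the disk reads it avoids
    via peer caches outweigh the overhead of serving its peers."""
    saved = peer_hits * (DISK_US - PEER_US)
    spent = served_requests * SERVE_US
    return saved > spent
```

Under these numbers, even a modest peer-hit count justifies serving many peer requests, which hints at why cooperation can survive selfishness when remote caches are much faster than disk.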

We present and analyze several novel cooperative caching approaches for varying degrees of client selfishness. Our evaluation focuses on the performance as well as the energy requirements of these approaches, across a wide range of systems and workload domains. Our results provide initial insights regarding the effect of client selfishness on the performance of a cooperative storage cache.