Ph.D Thesis

Ph.D StudentEinziger Gil
SubjectApproximate Compact Data Structures and Applications
DepartmentDepartment of Computer Science
Supervisor PROF. Roy Friedman
Full Thesis textFull thesis text - English Version


This thesis introduces two novel data structures: TinySet is optimized for the approximate set problem and TinyTable deals with the approximate counting problem. We show that both suggestions require less space than a variety of previous approaches. We also show that their operation speed can be faster than that of Bloom filters, especially for query operations.

We then present two novel cache management techniques: TinyLFU and Shades. In TinyLFU, approximate counting is used in order to estimate the recent frequency of encountered items. That estimation is then used to implement an approximate LFU admission policy. It is shown that TinyLFU significantly improves the hit rates of caches under a variety of real life workloads.

Shades utilizes a novel routing and caching approach in order to shorten both the average and median searches in Kademlia. Shades routing technique enables many TinyLFU augmented caches to work together in order to achieve high hit rates.  We show that Shades indeed shortens both the average and median lookup, without additional message and bandwidth overheads.