M.Sc Thesis

M.Sc StudentShimron Yuval
SubjectSmaller Footprint for Java Collections
DepartmentDepartment of Computer Science
Supervisor ASSOCIATE PROF. Joseph Gil
Full Thesis textFull thesis text - English Version


In dealing with the container bloat problem, we identify five memory compaction
techniques, which can be used to reduce the footprint of the small objects that make these containers. Using these techniques, we describe two alternative methods for more efficient encoding of the JRE’s ubiquitous HashMap data structure, and present a mathematical model in which the footprint of this can be analyzed.

First of this is our fused hashing encoding method, which reduces memory overhead by 20%-45% on a 32-bit environment and 45%-65% on a 64-bit environment. This encoding guarantees these figures as lower bound regardless of the distribution of keys in hash buckets. Second, our more opportunistic squashed hashing, achieves expected savings of 25%-70% on a 32-bit environment and 30%-75% on a 64-bit environments, but these savings can degrade and are not guaranteed against bad (and unlikely) distribution of keys to buckets. Timing results indicate that no significant improvement or degradation in runtime is noticeable for several common JVM benchmarks: SPECjvm2008, SPECjbb2005 and DaCapo. Naturally, some specific map operations are slowed down in compare to the simple base implementation due to a more complex and less straightforward implementation. We note that with the use of our compaction techniques, there is merit in an implementation of HashSet which is distinct from that of HashMap.

For TreeMap we show two encodings which reduces the overhead of tree nodes by 43% , 46% on a 32-bit environment and 55% , 73% on a 64-bit environment. These compactions give also a reason to separating the implementation of TreeSet from that of TreeMap. A separate implementation is expected to increase the footprint reduction to 59% , 54% on a 32-bit environment and 61% , 77% on a 64-bit environment.