|M.Sc Student|Peleg Omer|
|Subject|Utilizing the IOMMU Scalably|
|Department|Department of Computer Science|
|Supervisor|Professor Dan Tsafrir|
I/O memory management units (IOMMUs) provided by modern hardware allow the operating system to enforce memory protection on the direct memory access (DMA) operations of its I/O devices. An IOMMU translation management design must scale to the frequent, concurrent updates of IOMMU translations that multiple cores make in high-throughput I/O workloads such as multi-Gb/s networking. Today, however, operating systems experience performance meltdowns when using the IOMMU in such workloads.
This work explores scalable IOMMU management designs and addresses the two main bottlenecks we find in current operating systems: (1) assignment of I/O virtual addresses (IOVAs), and (2) management of the IOMMU's translation lookaside buffer (TLB).
We propose three approaches for scalable IOVA assignment: (1) dynamic identity mappings, which eschew IOVA allocation altogether, (2) allocating IOVAs using the kernel's kmalloc, and (3) per-core caching of IOVAs allocated by a globally locked IOVA allocator. We further describe a scalable IOMMU TLB management scheme that is compatible with all three approaches.
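To make the third approach concrete, the following is a minimal sketch of per-core caching in front of a globally locked allocator. It is an illustration under stated assumptions, not the design evaluated in this work: the class names `GlobalIovaAllocator` and `PerCoreIovaCache`, the fixed page-sized allocations, the `capacity` bound, and the `lock_acquisitions` counter are all hypothetical. It also assumes that each core touches only its own cache, as kernel code running with preemption disabled would.

```python
import threading

PAGE_SIZE = 4096  # assumed IOVA granularity for this sketch


class GlobalIovaAllocator:
    """Hypothetical globally locked allocator handing out page-sized IOVAs."""

    def __init__(self, base=0x10000000):
        self._lock = threading.Lock()
        self._next = base            # next never-used IOVA
        self._free = []              # IOVAs returned to the global pool
        self.lock_acquisitions = 0   # instrumentation for this sketch only

    def alloc(self):
        with self._lock:
            self.lock_acquisitions += 1
            if self._free:
                return self._free.pop()
            iova = self._next
            self._next += PAGE_SIZE
            return iova

    def free(self, iova):
        with self._lock:
            self.lock_acquisitions += 1
            self._free.append(iova)


class PerCoreIovaCache:
    """Per-core free lists that absorb most alloc/free traffic."""

    def __init__(self, backend, num_cores, capacity=64):
        self._backend = backend
        self._capacity = capacity    # assumed bound on cached IOVAs per core
        self._cached = [[] for _ in range(num_cores)]

    def alloc(self, core):
        cached = self._cached[core]
        if cached:
            return cached.pop()          # fast path: no global lock taken
        return self._backend.alloc()     # slow path: fall back to global lock

    def free(self, core, iova):
        cached = self._cached[core]
        if len(cached) < self._capacity:
            cached.append(iova)          # keep the IOVA for reuse on this core
        else:
            self._backend.free(iova)     # cache full: return to global pool
```

In this sketch, a core that repeatedly maps and unmaps buffers takes the global lock only on its first allocation or when its cache overflows, so the lock stops being a point of contention in steady state.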
Evaluation of our designs under Linux shows that (1) they achieve 88.5% to 100% of the performance obtained without an IOMMU, (2) they achieve latency similar to that obtained without an IOMMU, (3) scalable IOVA allocation methods perform comparably to dynamic identity mappings because of their more efficient IOMMU page table management, and (4) kmalloc provides a simple solution with high performance, but can suffer from unbounded page table blowup if empty page tables are not freed.