M.Sc Student: Bergman Shai
Subject: High Performance Disk I/O on GPUs
Department: Department of Electrical Engineering
Supervisor: Professor Mark Silberstein
Recent GPUs enable Peer-to-Peer Direct Memory Access (P2PDMA) from fast peripheral devices like NVMe SSDs, excluding the CPU from the data path between them for efficiency. Unfortunately, using P2PDMA to access files is challenging because of the subtleties of low-level, non-standard interfaces, which bypass the OS file I/O layers and may hurt system performance. Developers must possess intimate knowledge of these interfaces in order to manually handle data consistency and misaligned accesses.
We present SPIN, which integrates P2PDMA into the standard OS file I/O stack, dynamically activating P2PDMA where appropriate, transparently to the user. It combines P2PDMA with page cache accesses and re-enables read-ahead for sequential reads, all while maintaining standard POSIX FS consistency, portability across GPUs and SSDs, and compatibility with virtual block devices such as software RAID.
We evaluate SPIN on NVIDIA and AMD GPUs using standard file I/O benchmarks, application traces and end-to-end experiments. SPIN achieves significant performance speedups across a wide range of workloads, exceeding P2PDMA throughput by up to an order of magnitude. It also boosts the performance of an aerial imagery rendering application by 2.6× by dynamically adapting to its input-dependent file access pattern, enables 3.3× higher throughput for a GPU-accelerated log server, and enables 29% faster execution for the highly optimized GPU-accelerated image collage with only 30 changed lines of code.