M.Sc Thesis


M.Sc StudentBarshatz Schneor Yamit
SubjectAccelerating Big-Data Sorting through Programmable Switches
DepartmentDepartment of Computer Science
Supervisor PROF. Roy Friedman
Full Thesis textFull thesis text - English Version


Abstract

Programmable switches are advanced network switches that can be programmed to perform certain actions besides routing packets. Programmable switches enable combining the logic and flexibility of software with the speed and efficiency of hardware and thus enjoy the benefits of both. Interestingly, programmable switches are also 2-3 orders of magnitude more power to compute efficient than GPUs and FPGAs. Alas, they offer a very restricted and non-intuitive programming model.


Sorting is a fundamental and well studied problem that has been studied extensively. Sorting plays a very important role in the area of databases, as many queries can be served much faster if the relations are first sorted. Merge sort is a very popular sorting algorithm in databases.


In modern data-centers, data is stored in storage servers, while processing takes place in compute servers. Hence, in order to compute queries on the data, the data must travel through the network from the storage servers to the computing servers. This creates a potential for using programmable switches to perform partial sorting in order to accelerate the sorting process at the server side. This is possible because, as mentioned above, the data packets pass through the switch in any case on their way to the server.


We devised a partial sorting algorithm that fits the programming model and restrictions of programmable switches. Our approach changes the order in which data is emitted from the switch. We also utilize built-in parallelism in the switch to divide the data into sequential ranges. Thus, the server needs to sort each range separately and then concatenate them to one sorted stream. The end result is that the server needs to sort smaller sections and each of these sections is already partially sorted. Hence, the server does less work, and the access pattern becomes more virtual memory friendly.


We examine several data stream compositions with various switch configurations. Our study exhibited an improvement of 20%-7 5 % in the sorting run-time when using our approach compared to plain sorting on the original stream. This suggests that our algorithm can significantly reduce the execution time at the servers.