M.Sc Thesis


M.Sc Student: Chapnik Koral
Subject: DARLING: Data-Aware Load Shedding in Complex Event Processing Systems
Department: Department of Computer Science
Supervisors: Prof. Assaf Schuster, Dr. Ilya Kolchinsky


Abstract

Complex event processing (CEP) is widely employed to detect user-defined combinations, or patterns, of events in massive streams of incoming data. Numerous applications, in domains such as healthcare and fraud detection, use CEP technologies to capture critical alerts, potential threats, or vital notifications, which requires the technology to meet real-time detection constraints. Multiple optimization techniques have been developed to minimize CEP processing time, including parallelization and pattern rewriting. However, these techniques may not suffice, or may not be applicable, when an unpredictable peak in the input event stream exceeds the system capacity. In such cases, an immediate solution is to drop part of the load, a technique known as load shedding.

We present a novel, scalable, and efficient load shedding mechanism for CEP systems. DARLING (DAta dRiven Load sheddING) partitions the input event stream into dedicated buffers from which the CEP engine consumes events. Considering arrival rates, cross-buffer correlations, and estimated processing complexity, DARLING assigns a global constraint on the size of the input event stream and a local constraint on the size of each buffer. The global constraint is used to detect overload situations, and the local constraints dictate from which buffers to drop events and how many events to drop. In addition, DARLING estimates the importance of each event with a utility function computed via data-driven statistical methods, which allows it to drop the least important events first. An extensive experimental evaluation on a broad set of real-life patterns and datasets demonstrates the superiority of our approach over state-of-the-art techniques.
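
The interplay between the global constraint, the local constraints, and the utility function can be illustrated with a minimal Python sketch. It is a toy illustration under stated assumptions, not the DARLING implementation: the class `ToyLoadShedder`, its method names, and the numeric limits are hypothetical, the limits and utilities are passed in as plain numbers rather than derived from arrival rates, cross-buffer correlations, or processing complexity, and the local limits are assumed to sum to no more than the global limit.

```python
class ToyLoadShedder:
    """Hypothetical sketch: per-buffer (local) limits plus one global limit."""

    def __init__(self, global_limit, local_limits):
        self.global_limit = global_limit          # cap on all buffered events
        self.local_limits = dict(local_limits)    # cap per buffer (assumed given)
        # Each buffer holds (utility, event) pairs in arrival order.
        self.buffers = {key: [] for key in self.local_limits}

    def total(self):
        """Number of events currently held across all buffers."""
        return sum(len(buf) for buf in self.buffers.values())

    def add(self, key, event, utility):
        """Buffer an event; shed load if the global limit is exceeded.

        Returns the list of events dropped by this call (possibly empty).
        """
        self.buffers[key].append((utility, event))
        if self.total() > self.global_limit:      # overload detected
            return self.shed()
        return []

    def shed(self):
        """Trim every buffer to its local limit, lowest utility first."""
        dropped = []
        for key, buf in self.buffers.items():
            excess = len(buf) - self.local_limits[key]
            if excess <= 0:
                continue
            # Positions of the `excess` least useful events (ties: oldest first).
            by_utility = sorted(range(len(buf)), key=lambda i: buf[i][0])
            victims = set(by_utility[:excess])
            dropped.extend(buf[i][1] for i in sorted(victims))
            # Survivors keep their arrival order.
            buf[:] = [pair for i, pair in enumerate(buf) if i not in victims]
        return dropped


shedder = ToyLoadShedder(global_limit=4, local_limits={"A": 2, "B": 2})
for key, event, utility in [("A", "a1", 0.9), ("A", "a2", 0.1),
                            ("A", "a3", 0.5), ("B", "b1", 0.7)]:
    shedder.add(key, event, utility)              # total stays within the limit
print(shedder.add("A", "a4", 0.8))                # ['a2', 'a3']
```

In this sketch the fifth event pushes the total past the global limit, so buffer A is trimmed to its local limit by discarding its two lowest-utility events, while buffer B, which is under its limit, is left untouched.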