Ph.D Thesis

Ph.D StudentFrenkel Hadar
SubjectAutomata over Infinite Data Domains:Learnability and
Applications in Program Verification
and Repair
DepartmentDepartment of Computer Science
Supervisors PROFESSOR EMERITUS Orna Grumberg
DR. Sarai Sheinvald
Full Thesis textFull thesis text - English Version


This thesis focuses on different aspects of formal verification of systems over infinite data domains. Applications for such systems can be found in communication systems, e-commerce systems, large data bases and more.

A widely used approach in program verification is model-checking. The problem of model-checking is, given a system and a specification, to determine whether the system satisfies the specification (and thus the verification succeeds); or that the system violates the specification (thus verification fails) and some error is found, witnessing the violation. We focus on the automata-theoretic approach to model-checking. In this approach, both the system and the specification are modeled as finite automata, and model-checking is then reduced to reasoning about these automata.

However, real-life systems often contain infinitely many different configurations, as they refer to the data in the system, which is unbounded.  In this case, the model-checking of such systems does not scale well, and may even become undecidable.

In this thesis we use finite-state automata in order to model different types of such systems. First, we consider ongoing systems over infinite data domains, with respect to temporal specifications. An example for such a system is a server that communicates with an unknown number of clients. We propose a new automaton model that is able to capture the fragment of Exists* Variable LTL, which is an extension of LTL that allows reasoning about infinite data domains. We use this model to suggest a bounded model-checking algorithm for such systems, and we characterize decidable fragments of the logic, for which we suggest a complete model-checking process.

Next, we consider the model of communicating systems, which is most suitable to model communication and security protocols. We exploit the partition of the system into smaller components (e.g. server and clients). We also use the finite automata representation, to suggest a modular verification and repair algorithm that is based on automata learning using the L* algorithm.

Finally, we consider symbolic automata, whose alphabet is the set of predicates over some Boolean algebra. We study the L* algorithm in the context of symbolic automata, since, as we demonstrate, it is widely used in program verification. Next, we study a different learning paradigm, namely identification is the limit. To the best of our knowledge, this is the first time that this paradigm is considered in the context of infinite data domains. We suggest an automata learning algorithm for systems with data over the natural or real numbers, and present some complexity results regarding the learnability of symbolic automata under this paradigm.