טכניון מכון טכנולוגי לישראל
הטכניון מכון טכנולוגי לישראל - בית הספר ללימודי מוסמכים  
M.Sc Thesis
M.Sc StudentCarmi Adam
SubjectAdaptive Multi-Pass Parsing
DepartmentDepartment of Computer Science
Supervisors Professor Michael Kaminski
Professor Ron Pinter
Full Thesis textFull thesis text - English Version


Abstract

An adaptive grammar is a formal grammar that explicitly provides mechanisms within the formalism to allow self-induced manipulation of the production rules.  In the context of programming languages, adaptive grammars are useful for specifying (and parsing) syntax macros, and common context-sensitive syntactical restrictions, e.g., type correctness and scoping rules, traditionally referred to as the static-semantics aspects of programming languages. Although several adaptive grammar formalisms have been proposed since the early 70s, most fail to provide the mechanisms required for specifying syntactical constructs, where entities may appear before their point of declaration (e.g., goto statements, Java class members, etc.); and all but one fail to provide the mechanisms required for fully specifying syntax macro expansions.  Furthermore, the over-permissive adaptive nature of these formalisms prevents the development of efficient, practical parsers. In this thesis we propose an adaptive formalism which is more restrictive with respect to grammatical manipulation yet powerful enough to handle constructs commonly found in extensible programming languages. It suggests a multi-pass approach that goes hand in hand with the adaptive paradigm, and elegantly solves the two aforementioned issues.  Furthermore, by taking advantage of the restrictions, we are able to provide an LR(1)-based parsing algorithm that is amenable to practical implementation and handles both incremental and decremental changes in the grammar efficiently. We formally prove the correctness of the parser and analyze its complexity. To accommodate frequent grammar modifications, the parser lazily constructs its parse-table as parsing progresses. An efficient algorithm for computing LR(1) item sets is presented, that significantly improves upon previously known methods.