M.Sc Thesis

M.Sc Student: Dotan Dolev
Subject: HyperFlow: a Visual, Ontology-Based Query and Data-Flow Language for End-User Information Analysis
Department: Department of Computer Science
Supervisor: Prof. Ron Pinter


As life-science research becomes increasingly information-centric, bioinformatics databases and analysis services become increasingly abundant. Currently, however, the burden of finding the available databases and services, learning how to use each of them, and integrating their functions and formats into unified analysis processes typically falls on end-users with practically no programming skills. Thus, one of the major challenges facing the bioinformatics community is to harness data integration, service-oriented architecture, ontologies, and advanced user interfaces so that users can easily create intricate in-silico experiments as part of their research.

With this goal in mind, we have developed HyperFlow, a novel visual language for information analysis that combines features of visual dataflow languages and visual query languages in a unified framework. HyperFlow is designed to make it easier for users to retrieve, filter, and manipulate information, using databases alongside other resources such as web services, in a transparent, intuitive, reproducible, and traceable manner. It allows users to visually design and execute information analysis processes in a single diagram. It lends itself both to the common programming mode of "design and execute" and to the ad-hoc mode of "interactive exploration", which is usually used in research. As a visual query language, its expressive power is at the top of its class. In addition, it may be used with relational and object-oriented databases, as well as with ontologies described in the highly expressive Web Ontology Language (OWL). Finally, although it was designed with a particular application in mind, HyperFlow is completely domain-neutral.

This thesis describes HyperFlow's features as well as the characteristics and design of the prototype interface we have implemented. This interface is "ontology-aware": it uses the information in the ontology to guide users through composing queries graphically, and it uses Semantic-Web-style matchmaking to find applicable services to launch. Finally, the interface is being built on top of the Eclipse platform, which provides an extensible architecture that we intend to use for further development of an extensive, open environment for bioinformatics analysis: the Bioinformatics Assay Environment.