Technion - Israel Institute of Technology - Graduate School
M.Sc Thesis
M.Sc Student: Friedman Lior
Subject: Recursive Feature Generation for Knowledge-Based Induction
Department: Department of Computer Science
Supervisor: Professor Shaul Markovitch


Abstract

When humans perform inductive learning, they often enhance the process by using
extensive background knowledge. Recently, large collections of common-sense and
domain-specific relational knowledge bases have become available on the web. With
the increasing availability of well-formed collaborative knowledge bases, it is
possible to significantly enhance the performance and accuracy of existing learning
algorithms by finding a way to effectively exploit these knowledge bases.

In this work, we present a novel supervised algorithm for injecting external
knowledge into induction algorithms using a feature generation framework. Given a
feature, the algorithm defines a new learning task over its set of values, and uses the
knowledge base to solve the constructed learning task. The resulting classifier is then
used as a new feature for the original problem. This approach allows us to make use
of existing methods in machine learning to better generate our features. We have
applied our algorithm to the domain of text classification using large semantic
knowledge bases such as YAGO2 and Freebase. We have shown that the generated
features significantly improve the performance of existing learning algorithms.
Additionally, we have shown that our approach performs significantly better than
another, unsupervised feature generation method, thus demonstrating the unique
benefits of our approach.
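
The core idea described above (define a learning task over a feature's values, solve it with the knowledge base, and use the resulting classifier as a new feature) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy knowledge base `KB`, the documents in `TRAIN`, the rule that an entity inherits the majority label of the documents mentioning it, and the use of a single-fact decision stump in place of a full induction algorithm.

```python
from collections import Counter

# Hypothetical toy knowledge base: entity -> set of relational facts.
KB = {
    "messi":   {"occupation:athlete", "sport:football"},
    "federer": {"occupation:athlete", "sport:tennis"},
    "bolt":    {"occupation:athlete", "sport:sprint"},
    "merkel":  {"occupation:politician"},
    "obama":   {"occupation:politician"},
}

# Original task (hypothetical): documents as sets of mentioned entities,
# labeled 1 (sports) or 0 (politics).
TRAIN = [
    ({"messi"}, 1),
    ({"federer", "messi"}, 1),
    ({"merkel"}, 0),
    ({"obama", "merkel"}, 0),
]


def label_values(docs):
    """Build the new learning task: each feature value (entity) gets the
    majority label of the documents that mention it (an assumed rule)."""
    votes = {}
    for entities, label in docs:
        for entity in entities:
            votes.setdefault(entity, Counter())[label] += 1
    return {e: c.most_common(1)[0][0] for e, c in votes.items()}


def learn_stump(value_labels, kb):
    """Solve the new task with the knowledge base: pick the single KB fact
    whose presence best separates positive from negative entities."""
    facts = sorted({f for e in value_labels for f in kb.get(e, ())})

    def accuracy(fact):
        hits = sum((fact in kb.get(e, ())) == bool(y)
                   for e, y in value_labels.items())
        return hits / len(value_labels)

    return max(facts, key=accuracy)


def make_feature(fact, kb):
    """Wrap the learned classifier as a binary feature over documents."""
    def feature(entities):
        return int(any(fact in kb.get(e, ()) for e in entities))
    return feature


value_labels = label_values(TRAIN)
fact = learn_stump(value_labels, KB)
feature = make_feature(fact, KB)

print(fact)
print(feature({"bolt"}), feature({"obama"}))
```

In this toy setting the generated feature also fires for a document that mentions only `bolt`, an entity that never appears in `TRAIN`, because the knowledge base links it to the same fact as the positive training entities. That kind of generalization over feature values is what the sketch is meant to convey.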