M.Sc Thesis | |
---|---|
M.Sc Student | Yanay David |
Subject | Supervised Learning of Semantic Relatedness |
Department | Department of Computer Science |
Supervisor | Prof. Ran El-Yaniv |
We propose and study a novel
supervised approach to learning semantic relatedness from examples.
Using an empirical risk minimization approach, our algorithm computes a weighted
measure of term co-occurrence with respect to a corpus of text documents, and
utilizes the labeled examples to fit the model to the training sample.
Our method is corpus-independent and can rely on essentially any sufficiently
large (unstructured) collection of coherent texts.
We present the results of experiments ranging from large to small scale.
Evaluation on the WordSim353
benchmark shows significant improvements in correlation over the state of the art,
using either a reduced (older) version of Wikipedia or the books in the Project
Gutenberg collection.
These results indicate that the proposed method is effective and competitive
with the state of the art.
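
To make the general idea concrete, the following is a minimal Python sketch of fitting a weighted co-occurrence measure to labeled examples by empirical risk minimization. It is not the model from the thesis: the per-document weights, the document-frequency normalization, the squared loss, the gradient-descent settings, and the toy corpus and labels are all assumptions chosen for illustration.

```python
from math import sqrt

def cooccurrence_features(a, b, docs):
    """One feature per document (assumed form): 1/sqrt(df(a)*df(b))
    if both terms occur in the document, else 0."""
    df_a = sum(a in d for d in docs)
    df_b = sum(b in d for d in docs)
    if df_a == 0 or df_b == 0:
        return [0.0] * len(docs)
    norm = sqrt(df_a * df_b)
    return [1.0 / norm if (a in d and b in d) else 0.0 for d in docs]

def relatedness(a, b, docs, weights):
    """Weighted co-occurrence score: a linear function of the weights."""
    feats = cooccurrence_features(a, b, docs)
    return sum(w * f for w, f in zip(weights, feats))

def empirical_risk(examples, docs, weights):
    """Mean squared error over the labeled training pairs."""
    total = sum((relatedness(a, b, docs, weights) - y) ** 2
                for a, b, y in examples)
    return total / len(examples)

def fit_weights(examples, docs, lr=0.5, epochs=200):
    """Minimize the empirical risk with plain batch gradient descent."""
    weights = [1.0] * len(docs)
    n = len(examples)
    for _ in range(epochs):
        grad = [0.0] * len(docs)
        for a, b, y in examples:
            feats = cooccurrence_features(a, b, docs)
            err = sum(w * f for w, f in zip(weights, feats)) - y
            for i, f in enumerate(feats):
                grad[i] += 2.0 * err * f / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

# Toy corpus: each document is reduced to its set of terms (hypothetical data).
docs = [
    {"cat", "dog", "pet"},
    {"cat", "pet", "fur"},
    {"car", "road", "engine"},
    {"dog", "road", "walk"},
]
# Labeled examples: (term, term, relatedness score), also hypothetical.
examples = [("cat", "dog", 0.8), ("cat", "pet", 0.9), ("dog", "road", 0.2)]

initial_weights = [1.0] * len(docs)
risk_before = empirical_risk(examples, docs, initial_weights)
weights = fit_weights(examples, docs)
risk_after = empirical_risk(examples, docs, weights)
```

In this sketch the unweighted co-occurrence score is the starting point (all weights equal to 1), and training moves the weights so that the scores agree with the labels. A realistic setting would use a large corpus, a held-out evaluation set, and some form of regularization, none of which are shown here.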