Technion - Israel Institute of Technology - Graduate School
M.Sc Thesis
M.Sc Student: Smadja Uzi
Subject: Text Readability and its Relationship with Backtracking
Department: Department of Computer Science
Supervisors: Professor Shaul Markovitch
Dr. Mor Naaman


Abstract

Rich engagement data can shed light on how people interact with online content and on how the content of a page shapes those interactions. In this work, we investigate one specific type of interaction, backtracking: scrolling back in the browser while reading an online news article. We leverage a dataset of close to 700K instances of more than 15K readers interacting with online news articles to characterize and predict backtracking behavior. We first define and explore different types of backtracking actions. We then show that "full" backtracks, in which readers eventually return to the spot at which they left the text, can be predicted using features that have been shown to relate to text readability. This finding highlights the relationship between backtracking and readability and suggests that backtracking could help assess the readability of content at scale, since backtracking events are easy to collect. Backtracking events are also finer-grained than readability measures: they capture signals within the page rather than a single page-wide signal. By controlling for types of readers and topics of news articles, we show that backtracking events are primarily caused by the textual features of the article.
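To make the notion of a "full" backtrack concrete, the following is a minimal illustrative sketch and not the thesis's actual method. It assumes a reading session is available as a time-ordered list of vertical scroll offsets in pixels, treats any decrease in offset as the start of a backtrack, and marks the backtrack as full if the reader later reaches the departure offset again. The function name `find_backtracks` and the trace format are hypothetical.

```python
from typing import List, Tuple


def find_backtracks(positions: List[int]) -> List[Tuple[int, int, bool]]:
    """Detect backtrack episodes in a scroll trace (illustrative assumption).

    positions: time-ordered vertical scroll offsets, larger = further down.
    Returns one (start_index, lowest_offset, is_full) tuple per episode, where
    is_full means the reader later got back to the offset they left from.
    """
    episodes = []
    n = len(positions)
    i = 0
    while i < n - 1:
        if positions[i + 1] < positions[i]:
            # The reader scrolled up: a backtrack starts at index i.
            departure = positions[i]
            lowest = positions[i + 1]
            j = i + 1
            # Follow the trace until the reader is back at the departure offset.
            while j < n and positions[j] < departure:
                lowest = min(lowest, positions[j])
                j += 1
            is_full = j < n  # False if the trace ended before the return
            episodes.append((i, lowest, is_full))
            i = j if is_full else n
        else:
            i += 1
    return episodes


# Hypothetical trace: one full backtrack, then one that never returns.
trace = [0, 100, 200, 150, 120, 200, 300, 250]
print(find_backtracks(trace))
```

Because each episode records where it occurred in the trace, events of this kind can be tied to a location within the page, which is the granularity advantage over page-wide readability scores described in the abstract.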