Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Horon Reuven
Subject: Reverse Engineering of Recurrent Neural Networks - 3 test cases
Department: Department of Electrical Engineering
Supervisor: Professor Ron Meir
Full Thesis Text: English Version


Abstract

Recurrent Neural Networks (RNNs) are known to be powerful computational models. They gained popularity a few decades ago but, due to difficulties in training them, were largely abandoned. In recent years, mainly owing to more powerful computers, an abundance of training data, and improved algorithms, the use of RNNs has been revived, leading to state-of-the-art results in fields such as machine translation and speech recognition. Despite the success of RNNs on real-world problems, the mechanisms underlying the functionality of these networks remain poorly understood, and they are often treated as a black box.


In this study, we aim to gain a better understanding of these networks. To this end, RNNs are trained to solve three types of simple problems: a prediction task, a calculation task, and a classification task. For each task, the structure and mechanisms of the trained networks are analyzed from a dynamical systems perspective. We find that the solutions obtained by different network realizations tend to be parsimonious and utilize the same general mechanisms, suggesting that the target function and network structure impose strong constraints on the network's solution. Although RNNs solving a specific task share the same mechanisms, variability in how these mechanisms are implemented is observed across different network realizations. This variability manifests in various forms, such as a different structure of the fixed-state manifold, network activity that is not directly related to the specific task, and a varying location and number of fixed points.
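
The following is a minimal sketch, not taken from the thesis, of the kind of dynamical-systems analysis described above: locating approximate fixed points of a trained RNN's hidden-state dynamics by minimizing the speed function q(h) = 0.5*||F(h) - h||^2 and then checking their local stability via the Jacobian's eigenvalues. The specific update rule F, the weight matrix W, and the bias b below are illustrative assumptions rather than the networks studied in the thesis.

import numpy as np

rng = np.random.default_rng(0)
N = 16                        # hidden-state dimension (assumed for illustration)
W = 0.9 * rng.standard_normal((N, N)) / np.sqrt(N)
b = 0.1 * rng.standard_normal(N)

def F(h):
    """One step of an assumed autonomous tanh RNN with zero input."""
    return np.tanh(W @ h + b)

def find_fixed_point(h0, lr=0.1, steps=5000, tol=1e-10):
    """Gradient descent on q(h) = 0.5*||F(h) - h||^2 from an initial state h0."""
    h = h0.copy()
    for _ in range(steps):
        r = F(h) - h                          # residual of the fixed-point equation
        J = (1.0 - F(h) ** 2)[:, None] * W    # Jacobian of F at h: diag(1 - F(h)^2) @ W
        grad = (J - np.eye(N)).T @ r          # gradient of q with respect to h
        h -= lr * grad
        if 0.5 * r @ r < tol:
            break
    return h, 0.5 * (F(h) - h) @ (F(h) - h)

# Start fixed-point searches from several states visited by the dynamics.
h = rng.standard_normal(N)
candidates = []
for _ in range(20):
    h = F(h)
    fp, q = find_fixed_point(h + 0.01 * rng.standard_normal(N))
    if q < 1e-8:
        candidates.append(fp)

# Stability check: eigenvalues of the Jacobian with magnitude below one
# indicate a locally stable fixed point of the discrete-time dynamics.
for fp in candidates[:3]:
    J = (1.0 - F(fp) ** 2)[:, None] * W
    print("max |eig| =", np.abs(np.linalg.eigvals(fp := fp) if False else np.linalg.eigvals(J)).max())

The number and location of fixed points recovered this way, and the eigenvalue spectrum around them, are exactly the kinds of quantities that the abstract reports as varying between different network realizations trained on the same task.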