|M.Sc Student||Boaz Aizenshtark|
|Subject||On the Performance of Echo State Networks|
|Department||Department of Electrical Engineering|
|Supervisors||Professor Emeritus Zeevi Yehoshua, Full Professor Meir Ron|
The Echo State Network (ESN) and its biologically realistic counterpart, the Liquid State Machine (LSM), are two novel techniques for training and utilizing artificial recurrent neural networks. These techniques regard the network as an entity which performs on-line, non-linear computations on an input stream. The output of these computations is manifested in the network state, which can be read by an external unit. In contrast to other neural network schemes, the training process does not modify the network itself; it alters only the connections going to an external readout unit. Training is very simple and is done by linear regression. Nevertheless, the performance of such systems on many tasks is excellent.
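The training scheme described above can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the tanh units, the reservoir size, the spectral radius of 0.9, the input scaling, the ridge term, and the delay-1 recall task are all assumptions chosen for illustration, and the function names `run_reservoir` and `train_readout` are hypothetical. The point it shows is that the recurrent weights stay fixed while only the readout weights are fitted by linear regression on the recorded states.

```python
import numpy as np


def run_reservoir(inputs, w_in, w_res):
    """Drive a fixed (untrained) tanh reservoir with a scalar input stream.

    Returns an array of shape (len(inputs), n_units) holding the state at each step.
    """
    n_units = w_res.shape[0]
    states = np.zeros((len(inputs), n_units))
    x = np.zeros(n_units)
    for t, u in enumerate(inputs):
        x = np.tanh(w_res @ x + w_in * u)  # the reservoir itself is never modified
        states[t] = x
    return states


def train_readout(states, targets, ridge=1e-6):
    """Fit the readout weights by ridge-regularised linear regression."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


rng = np.random.default_rng(0)
n_units, n_steps, washout = 100, 2000, 100  # assumed sizes

# Random recurrent weights, rescaled to an assumed spectral radius of 0.9.
w_res = rng.normal(size=(n_units, n_units))
w_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(w_res)))
w_in = rng.uniform(-0.5, 0.5, size=n_units)

# Illustrative task: reproduce the input from one step earlier.
u = rng.uniform(-1.0, 1.0, size=n_steps)
target = np.roll(u, 1)
target[0] = 0.0

states = run_reservoir(u, w_in, w_res)
w_out = train_readout(states[washout:], target[washout:])  # the only trained part
prediction = states[washout:] @ w_out
mse = float(np.mean((prediction - target[washout:]) ** 2))
```

The first `washout` states are discarded before regression so that the arbitrary initial state does not influence the fit; this is a common convention rather than something stated in the abstract.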
Recent work has indicated that the performance of Echo State Networks is closely related to the network dynamics. The claim is that in diluted binary networks, the best performance is achieved when the network dynamics are neither ordered nor chaotic, but in a region called the ‘Edge of Chaos’. In the first part of our work we re-examine the notion of ‘Computation at the Edge of Chaos’ by conducting a set of experiments on two types of systems. The first is the diluted binary network, and the second is a fully-connected binary network, which is new in the context of Echo State Networks. We show that although the performance of these networks is related to their dynamics, different tasks require different dynamics, so it is impossible to create a network which is universally optimal. We also show a generic trade-off between the short-term and long-term memory capabilities of these networks.
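The distinction between ordered and chaotic dynamics in a diluted binary network can be made concrete with a small damage-spreading sketch. This is one common way to probe such dynamics and is not necessarily the measure used in the thesis. Everything specific here is an assumption: states in {-1, +1}, a fixed in-degree of 8, Gaussian weights, a shared random binary input, and the two weight scales picked to land clearly in each regime. The helper names are hypothetical. Two copies of the network start from states that differ in a fraction of units and receive the same input; the fraction of units that still differ after many steps indicates whether the initial difference was forgotten (ordered) or sustained (chaotic).

```python
import numpy as np


def make_diluted_network(n_units, in_degree, weight_std, rng):
    """Sparse weight matrix: each unit receives `in_degree` random Gaussian inputs."""
    w = np.zeros((n_units, n_units))
    for i in range(n_units):
        sources = rng.choice(n_units, size=in_degree, replace=False)
        w[i, sources] = rng.normal(0.0, weight_std, size=in_degree)
    return w


def step(w, x, u):
    """One synchronous update of a binary threshold network with states in {-1, +1}."""
    return np.where(w @ x + u >= 0, 1.0, -1.0)


def final_hamming_distance(w, n_steps, flip_fraction, rng):
    """Normalised Hamming distance between two copies after `n_steps` shared inputs."""
    n = w.shape[0]
    x_a = rng.choice([-1.0, 1.0], size=n)
    x_b = x_a.copy()
    flipped = rng.choice(n, size=int(flip_fraction * n), replace=False)
    x_b[flipped] *= -1.0  # perturb a fraction of the units in the second copy
    for _ in range(n_steps):
        u = rng.choice([-1.0, 1.0])  # both copies receive the same input
        x_a, x_b = step(w, x_a, u), step(w, x_b, u)
    return float(np.mean(x_a != x_b))


rng = np.random.default_rng(1)
n_units, in_degree = 300, 8  # assumed sizes

# Small weights: the shared input dominates and the perturbation is forgotten.
w_ordered = make_diluted_network(n_units, in_degree, weight_std=0.01, rng=rng)
ordered = final_hamming_distance(w_ordered, n_steps=50, flip_fraction=0.1, rng=rng)

# Large weights: recurrent feedback dominates and the perturbation persists.
w_chaotic = make_diluted_network(n_units, in_degree, weight_std=10.0, rng=rng)
chaotic = final_hamming_distance(w_chaotic, n_steps=50, flip_fraction=0.1, rng=rng)
```

Sweeping `weight_std` between the two extremes and recording the final distance would trace out the transition region in this toy model; where that transition lies depends on the assumed in-degree and input statistics.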
Besides the issue of performance, existing work tries to explain how ESNs and LSMs work. Because these systems are in general very complex, the analysis is more intuitive than rigorous. In the second part of our work we offer a new approach that complements the existing ones. We treat networks which are driven by a random input process. We show that the network has two distinct roles. The first is to generate a random state process, and the second is to supply a reservoir of approximating functions. Finally, we inspect some properties of the state process which allow the ESN to perform well.