Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Hubara Itay
Subject: Compressing Neural Networks Using Binary Representation
Department: Department of Electrical Engineering
Supervisor: Professor Ran El-Yaniv
Full Thesis Text: English Version


Abstract

We introduce a method to train Binarized Neural Networks (BNNs): neural networks with binary weights and activations at run-time. At train-time, the binary weights and activations are used for computing the parameter gradients. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which is expected to substantially improve power efficiency. To validate the effectiveness of BNNs, we conducted two sets of experiments, on the Torch7 and Theano frameworks. On both, BNNs achieved nearly state-of-the-art results on the MNIST, CIFAR-10 and SVHN datasets. We also report our results on the challenging ImageNet dataset using the AlexNet and GoogLeNet topologies, and demonstrate how 1-bit convolutions can accelerate the training of low-bitwidth neural networks (not just binary ones) while achieving prediction accuracy comparable to that of their 32-bit counterparts. For example, our quantized version of AlexNet, with 1-bit weights and 2-bit activations, achieves 51.1% top-1 accuracy. Last but not least, we programmed a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available on-line.
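
As a rough illustration of the train-time scheme described above, the following is a minimal NumPy sketch of one update step for a single linear layer: the forward and backward passes use the binarized weights, while the gradient update is accumulated in a real-valued copy. The toy MSE loss, learning rate and variable names are illustrative assumptions, not code from the thesis.

    import numpy as np

    def binarize(w):
        # Deterministic binarization: real values -> {-1, +1} (0 maps to +1).
        return np.where(w >= 0, 1.0, -1.0)

    rng = np.random.default_rng(0)
    w_real = rng.uniform(-1, 1, size=(4, 3))   # full-precision weights, kept only for training
    x = rng.uniform(-1, 1, size=(5, 4))        # toy batch of 5 inputs
    target = rng.uniform(-1, 1, size=(5, 3))   # toy regression targets

    for step in range(10):
        w_bin = binarize(w_real)               # forward/backward use the binary weights
        y = x @ w_bin
        grad_y = 2.0 * (y - target) / len(x)   # d(MSE)/dy
        grad_w = x.T @ grad_y                  # gradient w.r.t. the binary weights...
        w_real -= 0.1 * grad_w                 # ...applied to the real-valued copy
        w_real = np.clip(w_real, -1.0, 1.0)    # keep saturated weights from drifting

Clipping the real-valued weights to [-1, 1] mirrors the straight-through estimator used to backpropagate through the sign function: gradients pass through unchanged but are cancelled once a weight saturates.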
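
The bit-wise replacement of arithmetic, and the binary matrix multiplication kernel, both rest on a standard identity: if two {-1, +1} vectors are packed into machine words with the encoding -1 -> 0 and +1 -> 1, their dot product becomes an XNOR followed by a population count. The plain-Python sketch below shows the arithmetic; the function names pack and xnor_popcount_dot are hypothetical, and this is of course not the thesis's CUDA kernel.

    def pack(v):
        # Pack a {-1, +1} vector into the low bits of an int (+1 -> 1, -1 -> 0).
        bits = 0
        for i, val in enumerate(v):
            if val > 0:
                bits |= 1 << i
        return bits

    def xnor_popcount_dot(a_bits, b_bits, n):
        # Equal bits contribute +1, unequal bits -1, so the dot product is
        # (#equal) - (#unequal) = 2 * popcount(XNOR) - n.
        xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)   # XNOR, masked to the n valid bits
        return 2 * bin(xnor).count("1") - n

    a = [1, -1, 1, 1]
    b = [1, 1, -1, 1]
    assert xnor_popcount_dot(pack(a), pack(b), 4) == sum(p * q for p, q in zip(a, b))

On a GPU, a single 32-bit XNOR plus popcount thus stands in for 32 multiply-accumulate operations, which is essentially the source of the kernel speedup reported above.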