|M.Sc Student||Shai Furman|
|Subject||Multidimensional Image Representation and Processing Motivated by Human Vision|
|Department||Department of Electrical Engineering||Supervisor||Professor Emeritus Zeevi Yehoshua|
A biological model of visual information representation is adopted. Images are accordingly represented in a multidimensional space that incorporates the well-investigated dimensions of intensity, color and spatio-temporal frequency. The model is extended to incorporate additional, less investigated dimensions such as curvature, size and depth (for example, from binocular disparity). Along these and other dimensions that are yet to be discovered, the human visual system (HVS) enhances and emphasizes important image attributes by adaptation and nonlinear filtering. Emulating this visual-system processing of images along these dimensions is both feasible and of interest as a route to intelligent image processing and computer vision. Likewise, such processing and analysis of images can contribute to a better understanding of neurobiological sensory mechanisms.
The nonlinear Automatic Gain Control (AGC) model of processing along the visual dimensions is presented together with its biological foundations. The model is analyzed for its signal-to-noise ratio (SNR) characteristics. Several inputs and responses are considered and implemented along the visual dimensions of curvature, size, depth and convexity/concavity. The results are compared with those of psychophysical experiments and reproduce visual illusions well.
Sparsely connected, recurrent adaptive sensory neural networks (NN), incorporating nonlinear interactions in the feedback loops, are presented as an example of an AGC implementation with a generic artificial neural network (ANN).
Finally, examples of applications of the AGC model in image processing and computer vision are presented. These include high dynamic range (HDR) images, enhanced edge detection and completion of curves interrupted by occlusion.
Implementing the generic neural AGC model along all visual dimensions constitutes a universal, parsimonious and unified model that proposes how our visual system processes visual information along its various dimensions, before the later stage of sequential “visual routines” takes place. This approach may lead to the development of a metric for calculating the distance between images, and may facilitate the execution of important tasks such as recognition and classification.