Ph.D Thesis

Ph.D Student: Blau Yacob Yochai
Subject: Statistical Constraints in Image Processing and Machine Learning
Department: Department of Electrical and Computer Engineering
Supervisor: Associate Prof. Tomer Michaeli
Full Thesis text: English Version


Image restoration algorithms are typically evaluated by some distortion measure (e.g., PSNR, SSIM) or by human opinion scores that quantify perceptual quality. In our work, we prove mathematically that distortion and perceptual quality are at odds with each other. Specifically, we study the optimal probability of correctly discriminating the outputs of an image restoration algorithm from real images. We show that as the mean distortion decreases, this probability must increase (indicating worse perceptual quality). Contrary to common belief, this result holds for any distortion measure and is not merely a shortcoming of the PSNR or SSIM criteria. We also show that generative adversarial nets provide a principled way to approach the perception-distortion bound by minimizing a statistical constraint that is mathematically linked to the optimal rate of discriminating outputs from real images. This lends theoretical support to their observed success in low-level vision tasks. Based on our analysis, we propose a new methodology for evaluating image restoration methods, and use it to perform an extensive comparison between recent super-resolution algorithms. We also analyze popular full-reference and no-reference image quality measures and draw conclusions regarding which of them correlate best with human opinion scores.
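As a rough illustration of the "optimal probability of correctly discriminating" mentioned above, the sketch below uses a standard fact from decision theory: when real and restored samples are equally likely a priori, the best possible discriminator succeeds with probability 1/2 + 1/2 · TV(p, q), where TV is the total variation distance between the two distributions. The discrete toy distributions and the function names are assumptions made for this example only, not taken from the thesis.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions
    given as equal-length lists of probabilities."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def optimal_discrimination_probability(p, q):
    """Success probability of the best possible discriminator between
    samples from p and samples from q, assuming equal priors."""
    return 0.5 + 0.5 * total_variation(p, q)


# Hypothetical toy distributions over two "image" outcomes.
real = [0.5, 0.5]
restored_close = [0.5, 0.5]   # matches the real distribution exactly
restored_far = [0.9, 0.1]     # deviates from the real distribution

print(optimal_discrimination_probability(real, restored_close))
print(optimal_discrimination_probability(real, restored_far))
```

A value of 0.5 means the outputs are statistically indistinguishable from real samples (chance-level discrimination), while values closer to 1 indicate worse perceptual quality in the sense used above.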

Lossy compression is another domain where a common approach is to strive for the lowest possible distortion (at any given bit rate), as expressed by Shannon's rate-distortion theory. However, in light of the tradeoff between perceptual quality and distortion, it is natural to seek a generalization of rate-distortion theory that takes perceptual quality into account. We therefore introduce the statistical definition of perceptual quality into the classical rate-distortion function, to study the three-way tradeoff between rate, distortion, and perception. We show that restricting the perceptual quality to be high generally leads to an elevation of the rate-distortion curve, thus necessitating a sacrifice in either rate or distortion. We prove several fundamental properties of this triple tradeoff, calculate it in closed form for a Bernoulli source, and illustrate it visually on a toy MNIST example.
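One way to write down such a generalization is sketched below. The notation is an assumption made for illustration: X is the source, \hat{X} its reconstruction, \Delta a distortion measure, and d a divergence between distributions that plays the role of the statistical perceptual-quality index (smaller P means higher perceptual quality).

```latex
R(D, P) \;=\; \min_{p_{\hat{X} \mid X}} \; I(X; \hat{X})
\quad \text{s.t.} \quad
\mathbb{E}\big[\Delta(X, \hat{X})\big] \le D,
\qquad
d\big(p_X, p_{\hat{X}}\big) \le P
```

Dropping the second constraint (taking P large enough that it is inactive) recovers Shannon's classical rate-distortion function R(D). Tightening P can only shrink the feasible set, so R(D, P) ≥ R(D), which is consistent with the elevation of the rate-distortion curve described above.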

Finally, we study the role of statistical constraints in the context of dimensionality reduction. Spectral dimensionality reduction algorithms are a core technique, widely used for recognition, segmentation, tracking and visualization. However, despite their popularity, these algorithms suffer from a major limitation known as the repeated eigen-directions phenomenon. That is, many of the embedding coordinates they produce typically capture the same direction along the data manifold. This leads to redundant and inefficient representations that do not reveal the true intrinsic dimensionality of the data. We propose a general method for avoiding redundancy in spectral algorithms. Our approach relies on replacing the orthogonality constraint underlying those methods with statistical unpredictability constraints. We prove that these constraints necessarily prevent redundancy, and provide a simple technique to incorporate them into existing methods. As we illustrate on challenging high-dimensional scenarios, our approach produces significantly more informative and compact representations, which improve visualization and classification tasks.
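The repeated eigen-directions phenomenon can be seen on a simple toy case, sketched here under stated assumptions: the data is taken to be a long, narrow 32 x 5 grid, and the spectral embedding is taken to use the eigenvectors of the unnormalized grid-graph Laplacian. For such a grid the eigenvalues are known in closed form (sums of path-graph eigenvalues 2 - 2cos(πk/n)), so no numerical eigensolver is needed. The grid size and function names are hypothetical, and this only illustrates the problem, not the method proposed in the thesis.

```python
import math


def path_eigenvalue(k, n):
    """k-th Laplacian eigenvalue of a path graph with n nodes (k = 0..n-1)."""
    return 2.0 - 2.0 * math.cos(math.pi * k / n)


def leading_modes(n_long, n_short, count):
    """Return (k_long, k_short) for the `count` smallest non-trivial
    Laplacian eigenvalues of an n_long x n_short grid graph.

    k_long and k_short are the oscillation frequencies of the
    corresponding eigenvector along the long and short axes.
    """
    modes = []
    for k_long in range(n_long):
        for k_short in range(n_short):
            lam = path_eigenvalue(k_long, n_long) + path_eigenvalue(k_short, n_short)
            modes.append((lam, k_long, k_short))
    modes.sort()
    # Skip the trivial constant eigenvector (eigenvalue 0).
    return [(kl, ks) for _, kl, ks in modes[1:count + 1]]


for rank, (k_long, k_short) in enumerate(leading_modes(32, 5, 7), start=1):
    print(f"coordinate {rank}: k_long={k_long}, k_short={k_short}")
```

In this toy setting the first six embedding coordinates all have k_short = 0, i.e. they are functions of the long axis only, and the short axis first appears in the seventh coordinate. A two-dimensional embedding built from the leading coordinates would therefore capture a single direction twice.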