Technion - Israel Institute of Technology, Graduate School
Ph.D Thesis
Ph.D Student: Mechrez Roey
Subject: Statistical image similarity for image transformation
Department: Department of Electrical Engineering
Supervisors: Professor Yoav Schechner, Professor Lihi Zelnik-Manor, Dr. Eli Shechtman


Abstract

A key component of many image transformation algorithms is the similarity measure. Such measures are used in many frameworks, where they define the objective of an optimization problem. In this thesis, we present several new statistical similarity measures for image manipulation. The key idea underlying the thesis is to guide the internal image statistics to match a desired distribution. This is done through careful design of statistical measures that regularize the manipulation process. We study three approaches for doing so.


First, we train neural networks for image manipulation that generate images whose internal feature distribution is similar to that of the training images. This is done by designing a training objective, called the Contextual Loss, which measures the similarity between the feature distributions of two images. We represent each image as a set of points in a high-dimensional deep feature space and measure the distance between the two sets of points.

The loss function enables us to efficiently train the network to manipulate images in a way that maintains the internal feature distribution. Theoretical and empirical analysis of the Contextual Loss reveals its connection to the Kullback-Leibler divergence between the two underlying distributions of the images.
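The set-to-set comparison described above can be illustrated with a small NumPy sketch. It is not taken from the thesis: the cosine distance, the per-row normalization, the bandwidth `h`, the stabilizer `eps`, and the function names are all assumptions chosen to show one plausible way of scoring how well two unordered feature sets match, without requiring spatial alignment.

```python
import numpy as np

def contextual_similarity(x, y, h=0.5, eps=1e-5):
    """Similarity between two feature sets x (N, D) and y (M, D).

    Assumed formulation (illustrative only): cosine distances are
    normalized per row, turned into affinities, and each feature in y
    is scored by its best-matching feature in x.
    """
    mu = y.mean(axis=0, keepdims=True)           # center both sets on y's mean
    xc, yc = x - mu, y - mu
    xn = xc / (np.linalg.norm(xc, axis=1, keepdims=True) + eps)
    yn = yc / (np.linalg.norm(yc, axis=1, keepdims=True) + eps)
    d = 1.0 - xn @ yn.T                          # (N, M) cosine distances
    d_norm = d / (d.min(axis=1, keepdims=True) + eps)  # relative to best match per row
    w = np.exp((1.0 - d_norm) / h)               # distances -> affinities
    cx = w / w.sum(axis=1, keepdims=True)        # each row sums to 1
    return cx.max(axis=0).mean()                 # best match per y feature, averaged

def contextual_loss(x, y, h=0.5, eps=1e-5):
    """Negative log of the similarity; near zero when the sets coincide."""
    return -np.log(contextual_similarity(x, y, h, eps))

rng = np.random.default_rng(0)
feats_a = rng.normal(size=(50, 16))              # hypothetical feature sets
feats_b = rng.normal(size=(40, 16)) + 3.0
print(contextual_loss(feats_a, feats_a), contextual_loss(feats_a, feats_b))
```

Because the score depends only on which features are present and not on where they occur, the two sets may have different sizes and need no pixel-to-pixel correspondence. In a training setting the same computation would be written in an automatic-differentiation framework so that gradients can flow to the network.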

We apply the Contextual Loss to a set of new applications in which the training data is non-aligned. This is made possible by explicitly measuring the statistical similarity between two non-aligned images.

Experiments on several applications show that training with the Contextual Loss helps produce more realistic results.


Second, we show that matching the internal image statistics is also a key factor in patch-based manipulation algorithms. This approach consists of an iterative optimization that replaces image patches with other patches and aims to satisfy two requirements simultaneously: manipulating the image and maintaining realistic internal statistics. We focus on manipulating an image to control where a viewer's attention is drawn, i.e., its underlying saliency map.

Our solution is based on replacing image patches in the target regions with other patches from the same image. At each iteration, our patch-to-patch similarity considers both the saliency and the internal statistics of the given image. The result is more realistic images.
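To make the idea of saliency-aware patch replacement concrete, here is a heavily simplified, single-pass sketch. It is a stand-in, not the thesis's algorithm: the plain L2 patch distance, the saliency threshold `tau`, the patch size `p`, and the function names are all hypothetical. Each patch lying fully inside a target mask is replaced by its nearest patch from the same image whose mean saliency is at least `tau`, and overlapping replacements are averaged.

```python
import numpy as np

def extract_patches(img, p):
    """Return all p x p patches (flattened) and their top-left coordinates."""
    h, w = img.shape
    coords = [(i, j) for i in range(h - p + 1) for j in range(w - p + 1)]
    patches = np.stack([img[i:i + p, j:j + p].ravel() for i, j in coords])
    return patches, np.array(coords)

def replace_patches_once(img, saliency, mask, p=3, tau=0.5):
    """One replacement pass over a grayscale image (all inputs are 2-D)."""
    patches, coords = extract_patches(img.astype(float), p)
    sal_patches, _ = extract_patches(saliency.astype(float), p)
    mask_patches, _ = extract_patches(mask.astype(bool), p)
    # Candidate sources: patches from the same image that are salient enough.
    source = np.flatnonzero(sal_patches.mean(axis=1) >= tau)
    if source.size == 0:
        raise ValueError("no source patch reaches the saliency threshold")
    # Targets: patches that lie entirely inside the mask.
    targets = np.flatnonzero(mask_patches.all(axis=1))
    acc = np.zeros(img.shape, dtype=float)
    cnt = np.zeros(img.shape, dtype=float)
    for t in targets:
        dist = ((patches[source] - patches[t]) ** 2).sum(axis=1)
        best = source[np.argmin(dist)]           # nearest salient patch
        i, j = coords[t]
        acc[i:i + p, j:j + p] += patches[best].reshape(p, p)
        cnt[i:i + p, j:j + p] += 1
    out = img.astype(float).copy()
    covered = cnt > 0
    out[covered] = acc[covered] / cnt[covered]   # average overlapping votes
    return out
```

An iterative scheme would repeat such a pass, recomputing the saliency map between iterations. Drawing replacements only from the image itself is what keeps the result close to the image's own patch statistics.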


Finally, we suggest manipulating images in the gradient domain, a basic approach that modifies the image gradients to match natural statistics. We show via style-transfer examples that we can take an image generated by a neural network and reduce its unrealistic artifacts by manipulating its internal statistics.

Our solution is based on using the Screened Poisson Equation as a post-processing step.

In practice, solving the Screened Poisson Equation involves balancing two competing considerations: (a) similarity to the content image in terms of structure and (b) similarity to the stylized output image in terms of appearance (e.g., color). We demonstrate that our post-processing can combine the gradients of the input image with the colors of the stylized version to achieve a photorealistic result that matches natural statistics.
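The trade-off between the two considerations can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the thesis's implementation: it minimizes `lam * ||O - S||^2 + ||grad O - grad C||^2`, where `S` is the stylized image and `C` the content image, which leads to the screened Poisson equation `lam * O - laplacian(O) = lam * S - laplacian(C)`. Periodic boundary conditions, the 5-point discrete Laplacian, the FFT solver, the weight `lam`, and the function name are assumptions made for brevity.

```python
import numpy as np

def screened_poisson(stylized, content, lam=0.1):
    """Blend the colors of `stylized` with the gradients of `content`.

    Both inputs are 2-D arrays of the same shape (one color channel).
    Assumes periodic boundaries so the equation is diagonal in Fourier space.
    """
    h, w = stylized.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    # Eigenvalues of the negative 5-point Laplacian under periodic boundaries.
    lap = (2.0 - 2.0 * np.cos(2.0 * np.pi * fy)) + (2.0 - 2.0 * np.cos(2.0 * np.pi * fx))
    s_hat = np.fft.fft2(stylized)
    c_hat = np.fft.fft2(content)
    # Per-frequency solution of (lam + lap) * O = lam * S + lap * C.
    o_hat = (lam * s_hat + lap * c_hat) / (lam + lap)
    return np.real(np.fft.ifft2(o_hat))

rng = np.random.default_rng(0)
content = rng.random((16, 16))                   # hypothetical content channel
stylized = rng.random((16, 16))                  # hypothetical stylized channel
result = screened_poisson(stylized, content, lam=0.1)
```

A small `lam` favors the content image's gradients (structure), while a large `lam` keeps the output close to the stylized image (appearance). For a color image the function would be applied to each channel separately.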