Ph.D Thesis

Ph.D StudentSandler Roman
SubjectNonnegative Matrix Factorization for Segmentation Analysis
DepartmentDepartment of Computer Science
Supervisor PROF. Michael Lindenbaum
Full Thesis textFull thesis text - English Version


The conducted research project is concerned with image segmentation - one of the central problems of image analysis.

A new model of segmented image is proposed and used to develop tools for analysis of image segmentations: image specific evaluation of segmentation algorithms' performance, extraction of image segment descriptors, and extraction of image segments. The proposed model, unlike prevalent ones, describes segmentations using image adaptive properties, which makes it relatively robust to context factors (e. g., texture). The image representation in the proposed terms may be obtained in a fully unsupervised process and it does not require learning from other images.

The proposed model characterizes the histograms, or some other additive feature vectors, calculated over the image segments as nonnegative combinations of basic histograms. It is shown that the correct (manually drawn) segmentations generally have similar descriptions in such representation. A specific algorithm to obtain such histograms and combination coefficients is proposed; it is based on nonnegative matrix factorization (NMF).

NMF approximates a given data matrix as a product of two low rank nonnegative matrices, usually by minimizing the L2 or the KL distance between the data matrix and the matrix product. This factorization was shown to be useful for several important computer vision applications. New NMF algorithms are proposed here to minimize two kinds of the Earth Mover's Distance (EMD) error between the data and the matrix product. We propose an iterative NMF algorithm (EMD NMF) and prove its convergence. The algorithm is based on linear programming. We discuss the numerical difficulties of the EMD NMF and propose an efficient approximation.

The advantages of the proposed combination of linear image model with sophisticated decomposition method are demonstrated with several applications: First, we use the boundary mixing weights to assess image segmentation quality in precision and recall terms without ground truth. We demonstrate a surprisingly high accuracy of the unsupervised estimates obtained with our method in comparison to human-judged ground truth. We use the proposed unsupervised measure to automatically improve the quality of popular segmentation algorithms. Second, we discuss the advantage of EMD NMF over L2-NMF in the context of two challenging computer vision tasks - face recognition and texture modeling. The latter task is built on the proposed image model and demonstrates its application to non-histogram features. Third, we show that a simple segmentation algorithm based on rough segmentation using bilateral EMD NMF performs well for many image types.