|Ph.D Student||Bitton David|
|Subject||Scene Modeling for Image-Based Rendering|
|Department||Department of Electrical Engineering|
|Supervisor||Professor Emeritus Arie Feuer|
Scene modeling from images is a fundamental computer vision problem. It has occupied researchers for decades because it is ill-posed: in particular, there is no guarantee that the solution is unique. The main focus of this thesis is to explore a new principled approach to scene modeling. To that end, scene modeling is formulated as an image-based rendering problem. We claim that the key to a principled approach is to adopt an inverse problem formulation aiming at correct image prediction, so that the ambiguities of scene modeling are resolved explicitly. Assuming that the scene reflectance is Lambertian, a generative scene model with embedded constraints is devised so as to obtain a one-to-one relationship between instances of the model and their image predictions. As a result, an instance of the constrained model is the unique solution of the scene modeling problem if it generates correct predictions for the available views. As the set of available views grows, the embedded constraints become less stringent. In the limit, the existence of a unique solution is guaranteed.
Concretely, ambiguities arise when the scene radiance is constant in a region or when parts of the scene are not visible in the available views. The proposed model is based on a triangular mesh with a constant color on each facet. The embedded constraints bear on the colors of adjacent facets and the visibility of the mesh edges. In order to test the model, a new estimation procedure based on simulated annealing was developed. The adaptive model resolution, which stems from the color constraints, ensures the robustness needed to avoid over-fitting.
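As a rough illustration of estimation by simulated annealing, the following minimal sketch fits one constant grey level per facet to a handful of observed intensities. It is not the procedure developed in the thesis: the toy energy, the proposal move, the cooling schedule and all names (`simulated_annealing`, `prediction_error`, `perturb_one_facet`, `observed`) are assumptions made for this example, and the mesh geometry, visibility and color constraints are left out entirely.

```python
import math
import random


def simulated_annealing(energy, propose, state, t_start=1.0, t_end=1e-3,
                        cooling=0.995, rng=None):
    """Minimize `energy` with Metropolis acceptance and geometric cooling."""
    rng = rng or random.Random(0)
    current, current_e = state, energy(state)
    best, best_e = current, current_e
    t = t_start
    while t > t_end:
        candidate = propose(current, rng)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Always accept improvements; accept a worse state with
        # probability exp(-delta / t), which shrinks as t cools.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        t *= cooling
    return best, best_e


# Toy stand-in for image prediction (hypothetical data): each "facet" has
# one constant grey level, compared against an observed intensity.
observed = [0.1, 0.1, 0.8, 0.8, 0.4]


def prediction_error(colors):
    """Sum of squared differences between predicted and observed levels."""
    return sum((c - o) ** 2 for c, o in zip(colors, observed))


def perturb_one_facet(colors, rng):
    """Propose a new state by jittering the color of one random facet."""
    new = list(colors)
    i = rng.randrange(len(new))
    new[i] = min(1.0, max(0.0, new[i] + rng.gauss(0.0, 0.1)))
    return new


initial = [0.5] * len(observed)
best, best_e = simulated_annealing(prediction_error, perturb_one_facet, initial)
```

Keeping track of the best state seen so far means the returned energy can never exceed that of the initial guess, whatever the random draws.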
The idea of resolving the ambiguities of scene modeling at the model design stage rather than at the estimation stage can also be leveraged in the context of dense stereo from motion. The piecewise planar interpretation of visual scenes introduced above gives rise to a piecewise constant homography motion model. The optical flow, which can be readily transformed into a depth map, is estimated through a variational optimization procedure that minimizes the total variation of the motion model parameters in addition to violations of the brightness constancy assumption.
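One plausible way to write such an objective is sketched below. The notation is an assumption made for illustration and is not taken from the thesis: $I_0$ and $I_1$ are the two images, $\mathbf{h}(\mathbf{x}) \in \mathbb{R}^8$ holds the homography parameters attached to pixel $\mathbf{x}$, $w(\mathbf{x}; \mathbf{h})$ is the position of $\mathbf{x}$ after warping by that homography, and $\lambda > 0$ balances the two terms. The choice of norms and of an 8-parameter homography is likewise assumed.

```latex
E(\mathbf{h}) =
  \underbrace{\int_{\Omega}
    \bigl| I_1\bigl(w(\mathbf{x}; \mathbf{h}(\mathbf{x}))\bigr) - I_0(\mathbf{x}) \bigr|
    \, d\mathbf{x}}_{\text{brightness constancy violations}}
  \; + \;
  \lambda \underbrace{\sum_{k=1}^{8} \int_{\Omega}
    \bigl| \nabla h_k(\mathbf{x}) \bigr| \, d\mathbf{x}}_{\text{total variation of the parameters}},
\qquad
\mathbf{u}(\mathbf{x}) = w(\mathbf{x}; \mathbf{h}(\mathbf{x})) - \mathbf{x}.
```

Under this reading, the total variation term favors parameter fields that are constant over regions and jump only along region boundaries, which matches the piecewise constant homography model, and the optical flow $\mathbf{u}$ follows directly from the estimated parameters.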