Technion - Israel Institute of Technology - Graduate School
Ph.D Thesis
Ph.D Student: Aaron Wetzler
Subject: Topics in Computational Geometric Vision
Department: Department of Electrical Engineering
Supervisor: Full Professor Ron Kimmel
Full Thesis text: English Version


Abstract

In this dissertation we explore various geometric elements along the computational vision pipeline. By combining geometric principles for shape analysis with modern sensing techniques, large-scale datasets and powerful computational architectures, we show various new ways of enabling computers to better perceive, interpret and comprehend the geometry of the world around them. Specifically, we explore the topics of reconstruction, filtering and semantic processing within the context of computational geometric vision. We start by briefly covering approaches for reconstructing geometry on mobile devices. We then develop a method for performing efficient photometric stereo which can model non-linear and near-source lighting setups while avoiding the direct computation of normal fields. Another aspect of processing on which we focus is denoising: image and shape data obtained through modern cameras are often noisy. We contribute a new patch-based data denoising framework called the Patch-Beltrami filter for denoising gray-scale and color images, and extend it to depth fields and 3D meshes. We then modify the approach by reformulating the differential operators used as trainable kernels in a deep neural network and unrolling the update step through time. We demonstrate state-of-the-art results and highlight the fact that other PDE-based methods could take advantage of the same basic idea. In the last part of this work we discuss the problem of localizing and identifying self-similar geometric objects in a complex visual space. We specifically focus on identifying fingertips on articulating hands observed by depth cameras. We describe how we used high-accuracy magnetic sensors to annotate large quantities of training data. To perform learning efficiently, we turn to randomized trees and contribute a new approach for efficiently mapping the training of a random decision tree on billions of training samples with trillions of data points to a single multi-GPU computing node. Similarly, for inference we describe a novel pipelined FPGA hardware implementation.