Interactive 3-D Navigation based on Image plus Depth Representation
10am
Room 2610 (Lifts 31 & 32), 2/F Academic Building, HKUST

Examination Committee

Prof Kai TANG, MAE/HKUST (Chairperson)
Prof Weichuan YU, ECE/HKUST (Thesis Supervisor)
Prof Chia-Wen LIN, Department of Electrical Engineering, National Tsing Hua University (External Examiner)
Prof Ross MURCH, ECE/HKUST
Prof Shaojie SHEN, ECE/HKUST
Prof Gary CHAN, CSE/HKUST

 

Abstract

Interactive 3-D navigation lets users move freely through a 3-D scene instead of watching the content from a fixed viewpoint determined by the media producer. Building such a system requires considering a complete processing chain that includes 3-D scene representation, data compression and transmission, and (virtual) view synthesis. A proper 3-D scene representation is important to the entire system because it influences all subsequent processing modules. The image plus depth representation is currently the most popular and widely used photo-realistic representation of a 3-D scene. The depth map captures a 2-D projection of the scene's 3-D geometry. With the depth information, it is much easier to reconstruct a virtual view using depth-image-based rendering (DIBR) techniques. In this thesis, we study practical solutions for interactive 3-D navigation based on the image plus depth representation.
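
To illustrate how a depth map helps reconstruct a virtual view, the following is a minimal sketch of generic DIBR forward warping on a single image row. It is not the method of the thesis. It assumes a rectified, horizontally shifted virtual camera, so that a pixel at depth Z moves by the disparity f·B/Z; the function name `warp_row` and the parameters `focal_px` and `baseline` are hypothetical names chosen for this example.

```python
from typing import List, Optional


def warp_row(colors: List[int], depths: List[float],
             focal_px: float, baseline: float) -> List[Optional[int]]:
    """Forward-warp one image row to a virtual camera shifted right by `baseline`.

    Assumes rectified cameras, so each pixel moves left by
    disparity = focal_px * baseline / depth.
    Positions that receive no pixel stay None (holes to be inpainted later).
    """
    width = len(colors)
    out: List[Optional[int]] = [None] * width
    zbuf = [float("inf")] * width  # nearest depth written so far per target pixel

    for x in range(width):
        z = depths[x]
        disparity = focal_px * baseline / z
        xt = int(round(x - disparity))  # target column in the virtual view
        # Keep the pixel only if it lands inside the row and is nearer
        # than whatever was written there before (occlusion handling).
        if 0 <= xt < width and z < zbuf[xt]:
            out[xt] = colors[x]
            zbuf[xt] = z
    return out


row = warp_row(colors=[10, 20, 30, 40, 50],
               depths=[4.0, 4.0, 2.0, 4.0, 4.0],
               focal_px=4.0, baseline=1.0)
print(row)  # [30, None, 40, 50, None]
```

In this toy row the nearer pixel (depth 2) shifts further than its neighbours and overwrites a background pixel, while the `None` entries mark a disoccluded position and a position that falls outside the source view. A practical renderer would fill such holes, for example by inpainting or by blending a second reference view.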
 
We first study the acquisition and compression of depth maps. For acquisition, we study depth estimation from a stereo image pair and propose a convex approach to the discrete multi-labeling problem of stereo matching by reformulating it as a quadratic programming problem. For compression, we propose a novel distortion metric for depth maps to replace the conventional sum-of-squared-error (SSE) metric, because depth distortion affects the quality of synthesized views differently from image distortion. Next, we turn to the problem of interactive 3-D navigation. We propose to organize the multiview image and depth data as navigation segments that can be decoded/reconstructed independently of the rest of the data. Navigation flexibility can be adjusted by changing the number and size of the navigation segments. Building on these segments, we study practical solutions to the navigation problem for 1-D and 2-D navigation segments respectively. In both cases, we consider an end-to-end system design and propose an optimization framework based on our novel rate and distortion models. We further investigate practical solution methods for the 1-D and 2-D cases to derive the optimal navigation segments, which achieve the best trade-offs among navigation criteria such as resource consumption, viewing quality, and decoding complexity.

Speakers / Performers:
Rui MA
Language
English