Project

Multi-camera video restoration using side information

Code
01D21213
Duration
01 October 2013 → 30 September 2018
Funding
Regional and community funding: Special Research Fund
Research disciplines
  • Engineering and technology
    • Multimedia processing
    • Biological system engineering
    • Signal processing
Keywords
video restoration, side information, multi-camera
 
Project description

The main novelty of the proposed research relates to the principle of distributed video restoration itself: we explicitly take into account the cost of communication and joint processing, and we define the multi-camera restoration problem as a set of loosely coupled single-camera restoration problems while still aiming for optimal performance. Although Distributed Video Coding (DVC) adopts a similar philosophy, it targets compression rather than restoration. Our goals are: to develop a mathematical framework for restoration of video sequences using side information, to determine which type of side information is most appropriate, and to develop strategies for exchanging this side information. To demonstrate the power of the theory, we will focus on a particularly challenging use case: restoring depth sequences captured by ToF or narrow-baseline stereo cameras, for which high-quality depth maps are hard to acquire with a single camera.

We will assume that the side information comes in two types: rough and detailed. The rough side information is a coarse estimate of the objects in the scene along with their geometry; IPI has worked on this problem before, and we know that this information can be computed within a camera network without transmitting video. The detailed side information is sent by other cameras upon request, with the restriction that, for scalability reasons, the overall bandwidth reserved for it is limited.
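As an illustration of these two types, the Python sketch below (the class names, fields and budget figure are our own assumptions, not part of the project) models the rough side information as a per-frame list of object bounding boxes with coarse depths, the detailed side information as a depth patch for one requested region, and a simple per-frame byte budget for the detailed messages.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class RoughSideInfo:
    """Coarse scene layout: one bounding box and coarse depth per detected object."""
    frame_index: int
    boxes: List[Tuple[int, int, int, int]] = field(default_factory=list)   # (x, y, w, h)
    coarse_depths: List[float] = field(default_factory=list)               # metres, one per box

@dataclass
class DetailedSideInfo:
    """High-quality depth samples for one requested spatio-temporal region."""
    frame_index: int
    region: Tuple[int, int, int, int]   # (x, y, w, h) in image coordinates
    depth_patch: bytes                  # encoded depth samples

class SideInfoBudget:
    """Tracks the fixed (small) bandwidth reserved for detailed side information."""
    def __init__(self, bytes_per_frame: int):
        self.bytes_per_frame = bytes_per_frame
        self.used = 0

    def try_send(self, msg: DetailedSideInfo) -> bool:
        size = len(msg.depth_patch)
        if self.used + size > self.bytes_per_frame:
            return False        # over budget: defer or drop the request
        self.used += size
        return True

# Example: a 4 kB/frame budget admits a 16x16 patch of 16-bit depth samples.
budget = SideInfoBudget(bytes_per_frame=4096)
patch = DetailedSideInfo(frame_index=0, region=(64, 32, 16, 16),
                         depth_patch=bytes(16 * 16 * 2))
print(budget.try_send(patch))   # True (512 bytes fit)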

The first sub-objective is to define measures for judging the quality of local depth estimates as a function of time and spatial coordinates: only local estimates of insufficient quality merit the use of limited resources to improve them, and only high-quality estimates merit the resources needed to transmit them to other cameras. The existing IPI methods for super-resolution and denoising of video sequences already contain hidden variables indicative of this quality, and these will be used as a starting point.
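As a hedged illustration of such a quality measure (not IPI's actual estimator; the cues, scales and combination rule below are placeholder assumptions), the following sketch combines a temporal-consistency cue and a local-roughness cue into a per-pixel confidence map in [0, 1]; low values mark estimates that would merit improvement.

import numpy as np

def depth_quality_map(depth_t, depth_t_minus_1, patch=5,
                      temporal_scale=0.05, roughness_scale=0.02):
    """Per-pixel confidence in [0, 1] for a local depth estimate; low values
    flag pixels whose estimate likely merits improvement via side information."""
    # Temporal cue: relative change between consecutive depth frames.
    temporal_err = np.abs(depth_t - depth_t_minus_1) / (np.abs(depth_t) + 1e-6)

    # Spatial cue: RMS deviation from the centre pixel over a small patch
    # (a cheap local-roughness measure; noisy or occluded areas score high).
    pad = patch // 2
    padded = np.pad(depth_t, pad, mode="edge")
    roughness = np.zeros_like(depth_t)
    for dy in range(patch):
        for dx in range(patch):
            shifted = padded[dy:dy + depth_t.shape[0], dx:dx + depth_t.shape[1]]
            roughness += (shifted - depth_t) ** 2
    roughness = np.sqrt(roughness / (patch * patch))

    # Map each cue to a confidence in [0, 1] and combine multiplicatively.
    conf_temporal = np.exp(-temporal_err / temporal_scale)
    conf_spatial = np.exp(-roughness / (roughness_scale * (np.abs(depth_t) + 1e-6)))
    return conf_temporal * conf_spatial

# Example on synthetic data: the noisy right half of the frame gets lower confidence.
rng = np.random.default_rng(0)
d_prev = np.full((64, 64), 2.0)
d_curr = d_prev + rng.normal(0, 0.2, d_prev.shape) * (np.arange(64) >= 32)
q = depth_quality_map(d_curr, d_prev)
print(q[:, :32].mean() > q[:, 32:].mean())   # True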

A second sub-objective is to define a strategy for deciding which side information to share and in what form. We will assume that a fixed (small) bandwidth is available for transmitting side information, and we will adopt a question-and-answer scenario: camera A first informs camera B about the spatio-temporal regions in which side information is needed; camera B then transmits this information; finally, camera A uses it to improve its initial estimate of the depth map. In the initial stages of the research, the question-and-answer protocol will be based on heuristics. For instance, we know that estimates from a narrow-baseline setup are accurate for objects close to the camera but of poor quality for distant objects. In later stages of the research, the question-and-answer strategy will be formulated in an information-theoretic framework: this amounts to assessing, on a training set, the additional information content provided by each piece of possible side information and relating it to the local context at both the sender and the receiver. This information is subsequently used to define the optimal question-and-answer strategy. This problem is similar to, but more difficult than, the problem of task assignment in camera networks [13].
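The sketch below illustrates the heuristic question-and-answer exchange under a fixed byte budget; the tile size, message format and fusion rule are assumptions made for the example, not the project's protocol. Camera A requests the tiles where its own confidence is lowest, camera B answers with its depth samples for those tiles, and camera A overwrites its estimate there.

import numpy as np

TILE = 16                           # spatio-temporal regions simplified to 16x16 tiles
BYTES_PER_TILE = TILE * TILE * 2    # e.g. 16-bit depth samples
BUDGET_BYTES = 4096                 # fixed small bandwidth for side information

def make_request(confidence):
    """Camera A: rank tiles by mean confidence and request the worst ones
    that fit in the bandwidth budget."""
    h, w = confidence.shape
    tiles = []
    for y in range(0, h, TILE):
        for x in range(0, w, TILE):
            tiles.append((confidence[y:y + TILE, x:x + TILE].mean(), y, x))
    tiles.sort()                    # lowest confidence first
    max_tiles = BUDGET_BYTES // BYTES_PER_TILE
    return [(y, x) for _, y, x in tiles[:max_tiles]]

def answer_request(request, depth_b):
    """Camera B: return its depth samples for the requested tiles."""
    return {(y, x): depth_b[y:y + TILE, x:x + TILE].copy() for y, x in request}

def fuse(depth_a, answer):
    """Camera A: replace its estimate in the requested tiles (a trivial fusion
    rule; the project would instead weight both estimates)."""
    fused = depth_a.copy()
    for (y, x), patch in answer.items():
        fused[y:y + TILE, x:x + TILE] = patch
    return fused

# Example: A is unreliable for distant objects (narrow baseline), B is not.
depth_true = np.tile(np.linspace(1.0, 10.0, 128), (128, 1))
rng = np.random.default_rng(1)
depth_a = depth_true + rng.normal(0, 0.5, depth_true.shape) * (depth_true > 5)
conf_a = np.exp(-(depth_true - 1.0) / 3.0)          # heuristic: far means unreliable
request = make_request(conf_a)
depth_a_improved = fuse(depth_a, answer_request(request, depth_true))
print(np.abs(depth_a_improved - depth_true).mean()
      < np.abs(depth_a - depth_true).mean())        # True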

The third sub-objective is to create new algorithms for joint denoising and super-resolution of depth and video sequences that take the side information into account. The current IPI techniques for denoising and super-resolution are built on probabilistic models (a Bayesian estimation framework). The side information needs to be incorporated into these models as an additional input, with the added difficulty that it is only available for some spatio-temporal regions of the sequence.
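To make the idea of partially available side information concrete, the sketch below solves an assumed MAP problem with three quadratic terms: fidelity to the camera's own noisy depth, a smoothness prior, and fidelity to the side information on the pixels where it exists (a binary mask). The formulation, weights and step count are illustrative, not IPI's actual model.

import numpy as np

def restore(noisy, side, mask, lam_prior=1.0, lam_side=4.0, steps=200, lr=0.1):
    """Gradient descent on the quadratic MAP objective
    0.5*||x - noisy||^2 + 0.5*lam_prior*||grad x||^2 + 0.5*lam_side*||mask*(x - side)||^2."""
    x = noisy.copy()
    for _ in range(steps):
        # Data term: stay close to the camera's own (noisy) depth estimate.
        grad = x - noisy
        # Smoothness prior: discrete Laplacian (periodic boundaries via np.roll).
        lap = (-4 * x
               + np.roll(x, 1, 0) + np.roll(x, -1, 0)
               + np.roll(x, 1, 1) + np.roll(x, -1, 1))
        grad += -lam_prior * lap
        # Side-information term: active only where the mask marks it as available.
        grad += lam_side * mask * (x - side)
        x -= lr * grad
    return x

# Example: high-quality side information covers only the left half of the frame.
rng = np.random.default_rng(2)
clean = np.tile(np.linspace(1.0, 3.0, 64), (64, 1))
noisy = clean + rng.normal(0, 0.3, clean.shape)
mask = np.zeros_like(clean)
mask[:, :32] = 1.0
side = clean + rng.normal(0, 0.02, clean.shape)     # remote camera's accurate depth
restored = restore(noisy, side, mask)
print(np.abs(restored - clean).mean() < np.abs(noisy - clean).mean())   # True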