Free viewpoint (multiview) video aims at enabling virtual navigation in a dynamic 3D scene from
arbitrary camera locations and orientations. The associated technologies also represent a solution to
the problem of providing genuinely immersive “glasses-free” 3D media, offering not only stereopsis,
but also motion parallax. Capturing high-quality multiview content from real-world scenes is an
enormous challenge, requiring a huge number of views from finely spaced angles. Unavoidably, the
number of captured views is much lower than what is required; thus, the missing views need to be
estimated. However, such view synthesis needs to account for the distances between objects and
cameras, complex 3D structure, illumination changes and non-Lambertian reflectance, none of
which are captured directly by single cameras. Existing solutions based on stereo or sparse
multiview camera setups fail to deliver acceptable 3D video quality. To solve this problem, we
propose a fundamental paradigm shift, focusing on multiview multimodal camera systems that
combine depth sensors and colour cameras of various spatial resolutions. Multimodal systems are
more versatile than linear camera arrays, but they also pose significant research challenges. The goal of
the project is to substantially advance the state of the art in multiview depth estimation and
view synthesis by designing radically new algorithms inspired by the new theoretical framework of
compressed sensing with side information.