People have long envisioned realistic, virtual recreations of remote scenes in which they can interact with the environment and with each other as if they were on site. Recent technological advances have made it possible to capture the three-dimensional world with, for example, visual cameras, light-field setups, and LiDAR. Using these devices, first research efforts have enabled users to both rotate their head and change their position within a virtual scene. As interaction with the content becomes more central, requirements on latency and bandwidth increase sharply, which in turn makes it difficult to maintain the user experience with state-of-the-art solutions. Because the 5G paradigm envisions bandwidths on the order of Gb/s and latency bounded only by the speed of light, meeting these requirements becomes a question of processing rather than of physical transmission. This project aims to tackle this challenge through two complementary research lines. First, the project will investigate how three-dimensional objects can be efficiently captured, encoded, and processed, using data-culling and network-orchestration techniques. Second, the project will research networking protocols and solutions to efficiently stream the content to the end user, thus optimizing the user experience and reducing latency in the end-to-end system. The proposed approaches will be evaluated through objective measurements and subjective experiments.