Project

Deep-Learning based Joint Audio, Video Processing for Augmented Listening

Code
01SC2722
Duration
01 October 2022 → 30 September 2026
Funding
Regional and community funding: Special Research Fund
Research disciplines
  • Natural sciences
    • Machine learning and decision making
    • Image processing
  • Engineering and technology
    • Audio and speech processing
    • Computer vision
    • Audio and speech computing
Keywords
Joint audio-video processing, Deep learning, Augmented reality
 
Project description

Augmented listening involves extracting the desired audio signal(s) from a distorted capture. Inspired by human speech perception, in which visual and acoustic cues jointly contribute to understanding, we aim to improve this extraction by augmenting the audio with visual information. A side application is the detection of inconsistencies between the two modalities, which may indicate deepfakes or otherwise compromised streams.
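
The description does not specify a method; one common way to realize audio extraction augmented with visual information is to fuse per-frame visual embeddings with the noisy audio and predict a time-frequency mask for the target signal. The PyTorch sketch below is a minimal illustration under that assumption; the class name AVExtractor, the layer sizes, and the concatenation-based fusion are hypothetical choices for illustration, not the project's actual design.

```python
# Minimal sketch of audio-visual fusion for target-signal extraction.
# All architecture choices (layer sizes, mask-based filtering, fusion by
# concatenation) are illustrative assumptions, not the project's design.
import torch
import torch.nn as nn


class AVExtractor(nn.Module):
    """Predicts a time-frequency mask for the target source from a noisy
    magnitude spectrogram and a synchronized stream of visual embeddings."""

    def __init__(self, n_freq=257, video_dim=512, hidden=256):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(n_freq, hidden), nn.ReLU())
        self.video_enc = nn.Sequential(nn.Linear(video_dim, hidden), nn.ReLU())
        # Joint temporal model over the concatenated audio/video features.
        self.fusion = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.mask_head = nn.Sequential(nn.Linear(hidden, n_freq), nn.Sigmoid())

    def forward(self, noisy_mag, video_feats):
        # noisy_mag:   (batch, frames, n_freq)   magnitude spectrogram
        # video_feats: (batch, frames, video_dim) per-frame visual embeddings,
        #              assumed already resampled to the audio frame rate
        a = self.audio_enc(noisy_mag)
        v = self.video_enc(video_feats)
        fused, _ = self.fusion(torch.cat([a, v], dim=-1))
        mask = self.mask_head(fused)           # values in [0, 1]
        return mask * noisy_mag                # masked estimate of the target


# Toy forward pass with random tensors standing in for real features.
model = AVExtractor()
noisy = torch.rand(2, 100, 257)
video = torch.rand(2, 100, 512)
estimate = model(noisy, video)
print(estimate.shape)  # torch.Size([2, 100, 257])
```

In principle, the same fused audio-visual representation could also feed a classifier that flags mismatch between the two streams, which corresponds to the deepfake-detection side application mentioned above.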