Project

ASPLASH - Audio, Speech and Language Processing and Analytics for Augmented Listening and Healthcare

Code
bof/baf/4y/2024/01/965
Duration
01 January 2024 → 31 December 2025
Funding
Regional and community funding: Special Research Fund
Research disciplines
  • Engineering and technology
    • Telecommunication and remote sensing
    • Audio and speech processing
    • Image and language processing
    • Pattern recognition and neural networks
    • Analogue and digital signal processing
    • Audio and speech computing
Keywords
Deep learning, Speech processing, Microphone arrays, Audio analytics, Audio enhancement, Hearing enhancement, Prior-informed machine learning, Hearing aids and healthcare, Augmented reality
 
Project description

In our noisy, modern world, where communication is key, clarity of speech (in terms of intelligibility and intent) is paramount. Enabling crystal-clear conversations in any setting - from a bustling coffee shop to a noisy subway station - is our goal. Through machine learning and signal processing, we will develop algorithms that enable hearables, such as earbuds, hearing aids, or smart glasses, to let the listener understand their communication partner in terms of content and context, while filtering out ambient interference. User-guided target indication - based on prior knowledge or on run-time modalities - is a key novelty. Other available or inferable acoustic information is also taken into account (e.g. the locations of different speakers, the background scene, ...). Guided generative models further help restore heavily distorted speech. Intelligent integration of this information ensures holistic signal capture and enables intelligent analytics, with applications to healthcare as well.
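
As a loose illustration of what user-guided target indication could look like at the signal level, the sketch below steers a toy delay-and-sum beamformer toward a direction indicated by the user (e.g. derived from head pose or a tap on a companion app). It is not the project's method; the array geometry, sample rate, steering angle, and the simulated scene are all hypothetical choices made for this example.

```python
"""Minimal sketch: steering a delay-and-sum beamformer toward a
user-indicated direction.  All parameters below are assumptions
made for illustration, not values from the ASPLASH project."""
import numpy as np


def delay_and_sum(mics: np.ndarray, fs: float, mic_pos: np.ndarray,
                  target_angle_deg: float, c: float = 343.0) -> np.ndarray:
    """Align and average a linear array toward the indicated direction.

    mics:    (M, T) time-domain microphone signals
    mic_pos: (M,) microphone positions along the array axis, in metres
    """
    angle = np.deg2rad(target_angle_deg)
    delays = mic_pos * np.sin(angle) / c           # per-mic delay (seconds)
    n = mics.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = np.fft.rfft(mics, axis=1)
    # Advance each channel by its delay so the target direction adds coherently.
    phase = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft((spectra * phase).mean(axis=0), n=n)


if __name__ == "__main__":
    fs = 16000
    mic_pos = np.arange(4) * 0.04                  # hypothetical 4-mic, 4 cm array
    t = np.arange(fs) / fs
    target = np.sin(2 * np.pi * 440 * t)           # stand-in for the desired talker
    # Simulate arrival from 30 degrees plus diffuse-like noise on each mic.
    delays = mic_pos * np.sin(np.deg2rad(30)) / 343.0
    mics = np.stack([np.interp(t - d, t, target) for d in delays])
    mics += 0.5 * np.random.randn(*mics.shape)
    enhanced = delay_and_sum(mics, fs, mic_pos, target_angle_deg=30)
    print("input SNR proxy :", np.var(target) / np.var(mics[0] - target))
    print("output SNR proxy:", np.var(target) / np.var(enhanced - target))
```

In the project the guidance would feed far richer models (e.g. learned target-speaker extraction and guided generative restoration); this fixed beamformer only shows how a run-time user cue can be turned into a spatial selection of the communication partner.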