Code
1281026N
Duration
01 October 2025 → 30 September 2028
Funding
Research Foundation - Flanders (FWO)
Promotor
Research disciplines
- Natural sciences
  - Acoustics and acoustical devices, waves
  - Marine pollution
- Engineering and technology
  - Marine engineering not elsewhere classified
  - Acoustics, noise and vibration engineering
  - Audio and speech computing
Keywords
audio-text captioning
marine soundscapes
artificial intelligence
Project description
Passive Acoustic Monitoring (PAM) is a powerful approach for studying marine environments, yet the vast volume of data it generates makes manual analysis impractical. Current automated methods remain at an early stage of development and lack robustness and scalability across diverse ecosystems. To unlock PAM's full potential for soundscape analysis, innovative tools are needed. This research project addresses this challenge by building on recent developments in large language models (LLMs), a promising avenue for soundscape analysis because it allows experts to interact with the data in a more qualitative and descriptive way. By training contrastive language-audio models, the project aims to develop a system that produces an informative embedding space. This space would support robust characterization of soundscape dynamics across underwater ecosystems, enabling detailed ecosystem monitoring. The resulting embeddings could inform soundscape analysis and support assessments of ecosystem health. Furthermore, reacting quickly to a detected change requires access to near-real-time data. For this reason, the resulting model will be deployed on embedded hardware so that it can compute the embeddings on-device and send them via satellite, extending real-time PAM beyond expensive cabled observatories.