Project

Actionable techniques for exploration, exploitation and planning in dynamic recommender systems

Code

01P02321

Duration

01 October 2021 → 25 September 2022

Funding

Regional and community funding: Special Research Fund

Promotor

Bart Dhoedt

Fellow

Cedric De Boom

Research disciplines

Natural sciences
- Adaptive agents and intelligent robotics
- Data mining
- Machine learning and decision making
- Natural language processing
- Neural, evolutionary and fuzzy computation

Keywords

- Recommender systems
- Exploration vs exploitation
- Planning
- Active inference
- Model capacity

Project description

Recommender systems are now deployed on many popular platforms, such as streaming services, webshops, and social media. While this technology is mature, a number of issues remain to be solved. First, we need to tackle the long-tail phenomenon by giving special treatment to the many less popular items. This poses challenges in small-scale and sparse data regimes. Second, we need to tone down the harmful echo chamber effect: modern-day recommenders only suggest similar items, without explicitly nudging users out of their comfort bubble. We can achieve this by making recommenders more proactive. Third, new items and changes in a user's context and taste can lead to data distribution shifts. Recommenders should be designed to deal with such non-stationary data. In this project, I will leverage sequential generative models and active inference as tools to tackle these issues. Active inference will allow for an inherent tradeoff between exploration and exploitation, such that more long-tail items are suggested and echo chambers are shrunk. Next, the generative model will enable planning capabilities in order to steer the recommendation process explicitly and in a transparent manner. Finally, I will also look at how to optimize the capacity of the generative model, either statically or dynamically. The work will be evaluated in real-life online settings, since offline evaluation is prone to unwanted feedback loops.