Project

FRACTION: eFficient semantics-pReserving And deCenTralized processIng of Big Data spread across persONal data vaults

Code

BOF/24J/2021/388

Duration

01 October 2021 → 30 September 2025

Funding

Regional and community funding: Special Research Fund

Promotor

Ruben Verborgh

Research disciplines

Natural sciences
- Knowledge representation and reasoning
- Distributed systems
- Information retrieval and web search
- Information technologies
- Knowledge management
- Web information systems

Keywords

Decentralization, Semantic reasoning and querying, Decentralized scheduling of heterogeneous resources

Project description

Today, centralized stores govern the storage and processing of Big Data. Due to regulation (e.g. GDPR) and growing public awareness of data sensitivity, a paradigm shift towards decentralization is imminent. This shift allows people to control their own personal data by guarding all public and private data that they or others create about them in a vault, and by selectively granting access to people and organizations of their choice. Future uses of Big Data are thus bound to shift from a small number of large datasets to a large number of small datasets. As a result, the fundamental assumptions on which current approaches to the characteristics of Big Data (Volume, Variety and Velocity) are built are no longer valid: volume cannot be tackled by centralizing data in a single location, high-velocity data streams cannot be sent to a centralized data center, and the variety problem cannot be resolved by imposing a single data format. FRACTION supports the shift to a decentralized approach by leveraging Semantic Web technologies. It will investigate 1) algorithms that autonomously distribute the analytics across the decentralized network while hiding its complexity from the user, 2) decentralized and user-friendly data access control policies, and 3) methods to exploit the heterogeneity of the decentralized network to improve the scalability and performance of the analytics.