eFficient decentRAlized proCessing of high-velociTy streamIng data spread across heterogeneous persONal data vaults

01 January 2022 → 31 December 2025
Research Foundation - Flanders (FWO)
Research disciplines
  • Natural sciences
    • Knowledge representation and reasoning
    • Distributed systems
    • Information retrieval and web search
    • Information technologies
    • Computational logic and formal languages
Keywords
  • semantic web
  • GDPR
Project description

Today, service providers govern the centralized storage & processing of personal data. Due to regulation (e.g. GDPR) and increasing awareness of data sensitivity, a paradigm shift towards decentralization is imminent. Decentralization lets people control their personal data: they guard all public & private data that they or others create about them in a vault, and selectively grant access to people & organizations of their choice. Services providing personal data analytics are thus bound to shift from processing a small number of large, centralized data sets on homogeneous storage to querying and processing a large number of small, decentralized data sets on heterogeneous nodes. Constantly requesting & transferring all the needed data from the vaults to the various service providers is infeasible, inefficient and not scalable. The challenge is aggravated by the increasing availability of high-velocity streaming data from, among others, wearables & social media. FRACTION supports the shift to a decentralized approach by leveraging Semantic Web technologies. It will investigate 1) algorithms that autonomously distribute the analytics across the decentralized network while hiding its complexity from the user, 2) algorithms that exploit data locality and intermediate aggregations to process the high-velocity decentralized streaming data in a timely fashion, and 3) methods that exploit the heterogeneity of the decentralized network to improve the scalability and performance of the analytics.