Novel knowledge from public life sciences data

01 January 2020 → 31 December 2024
Research Foundation - Flanders (FWO)
Research disciplines
  • Natural sciences
    • Bioinformatics and computational biology not elsewhere classified
    • Proteomics
life sciences
Project description

This application is a renewal of the Scientific Research Network (2015-W001015N) with the same title, which runs from 2015 until the end of 2019. This renewal updates two Flemish members of the consortium and adds a partner from the French-speaking region. During the five years of the previous Research Network, the consortium already made important scientific contributions, as evidenced by the many jointly organised events, jointly submitted grant proposals, and jointly authored publications (see the final report for details). With the addition in this renewal application of three new, highly qualified and uniquely complementary research groups, the scope and expertise of the Research Network expand even further.

At the same time, because of changes in location and research focus, two groups are no longer part of this renewal proposal. The KULeuven unit of Prof. Jan Ramon (department head: Prof. Maurice Bruynooghe) is no longer included, as Prof. Ramon has taken up a new position at INRIA in Lille, France, and has changed his research topic. The UGent group of Steven Maere is also no longer part of this renewal, as this group has likewise changed its research focus over the past years. In their place, three new groups are included. Two of these are included because of their strong focus on cutting-edge statistical methods for omics data, which was identified as a missing element in the previous consortium. These two groups are the statOmics group of Prof. Lieven Clement at UGent, and the recently founded CBIO group of Prof. Laurent Gatto at the De Duve Institute at UCLouvain. The third new group in this application, the group of Prof. Giovanni Samaey at KULeuven, focuses on efficient yet performant numerical algorithms, which have also been identified within the consortium as a bottleneck going forward, especially given the ever-rising sizes of data sets and the ever-increasing complexity of the algorithms developed within the consortium to extract new knowledge from these data.

Indeed, continuing from the previous Research Network, this consortium remains focused on optimally extracting value from the deluge of publicly available omics data that is produced by today’s high-throughput analytics in molecular biology. Each analysis characterises and quantifies thousands to hundreds of thousands of genes, transcripts, or proteins, and only small fractions of this vast amount of information are actually needed for, and used in, the research context in which these data are collected. However, because these data are steadily archived in ever-growing public data repositories, there is tremendous promise in re-analysing them using sophisticated, special-purpose algorithms that can handle the massive data set sizes as well as their substantial heterogeneity. This is precisely where the combined expertise of the groups in this consortium comes in, as together they constitute an internationally acclaimed and already established network of leading scientists who are transforming the way that life scientists look at their own, and at others’, data. The main scientific objective of this Scientific Research Community is therefore to perform large-scale mining and analysis of public life sciences data in different domains, and then to integrate these findings into multiscale knowledgebases to provide context and uncover functional relevance.
The final outcomes will thus not only be jointly organised events, joint national and international grant applications, and joint papers, but also reliable, easy-to-use online knowledgebases (similar to the LNCipedia, online Tabloid Proteome, and Scop3P resources that were jointly developed in the previous Research Network) that can be accessed freely by any interested researcher in the life sciences. The research groups assembled in this Research Network application are uniquely placed to perform this leading work, and can build on strong shared interests coupled with highly complementary skill sets (see group descriptions below). The planned work to be jointly undertaken (presented in more detail in tabular form in the section on concrete planning below) is at the same time highly ambitious and necessarily highly interdisciplinary. The goals set out in this project can therefore only be achieved if several expert groups interact closely and constructively. In the process, this Research Network will provide enormous benefit to the Flemish members, by firmly establishing these groups as absolute world leaders in the field of public omics data reprocessing, and will hence also benefit the Flemish research community by making Flanders the most attractive region in the world for ambitious and talented PhD students and postdocs to work in.