- Natural sciences
  - Single-cell data analysis
- Medical and health sciences
  - Analysis of next-generation sequence data
  - Computational biomodelling and machine learning
  - Computational transcriptomics and epigenomics
- Engineering and technology
  - Bio-informatics

Creating more expressive models of cells is a fundamental challenge in bioinformatics. Single-cell sequencing has recently emerged as a way of characterizing biology at higher resolution, facilitating the construction of more comprehensive models. The technology has inherent limitations, however: because only a small amount of genetic material is available per cell, the resulting data are noisier and less complete. Current imputation and denoising methods fail to (1) contextualize predictions by modeling interactions both between cells and between the biological properties of each cell, and (2) incorporate the network structure of biological systems. In this proposed research project, we aim to address both issues with neural networks that mimic graph structure (i.e., transformers). These models scale quadratically with input size, making them computationally infeasible on high-dimensional sequencing data. Consequently, the main challenge lies in designing biologically informed inductive biases that reduce this complexity while maintaining comparable performance. Because the proposed framework is generic, it is expected to capture the intricacies of molecular life in a more meaningful way than current approaches. Its applicability in transfer learning and its use in generating novel biological insights will therefore be explored during the research project.
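
To make the quadratic-scaling argument concrete, the following is a minimal NumPy sketch, not the project's actual method. It contrasts standard dense self-attention, which scores every pair of inputs, with attention restricted to the edges of an interaction graph, which is one possible form a biologically informed inductive bias could take. The function names (`dense_attention`, `graph_attention`), the `edges` array, and the toy ring-shaped "gene network" are all hypothetical and chosen only for illustration.

```python
import numpy as np


def dense_attention(q, k, v):
    """Standard scaled dot-product attention: builds an n x n score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v


def graph_attention(q, k, v, edges):
    """Attention restricted to the (target, source) pairs listed in `edges`.

    Work and memory grow with the number of edges rather than with n**2.
    Targets without any incoming edge receive an all-zero output row.
    """
    tgt, src = edges[:, 0], edges[:, 1]
    # One score per edge instead of one per pair of inputs.
    scores = np.einsum("ed,ed->e", q[tgt], k[src]) / np.sqrt(q.shape[1])
    # Softmax over the incoming edges of each target.
    row_max = np.full(q.shape[0], -np.inf)
    np.maximum.at(row_max, tgt, scores)
    weights = np.exp(scores - row_max[tgt])
    row_sum = np.zeros(q.shape[0])
    np.add.at(row_sum, tgt, weights)
    weights /= row_sum[tgt]
    # Weighted sum of source values, accumulated per target.
    out = np.zeros_like(v)
    np.add.at(out, tgt, weights[:, None] * v[src])
    return out


rng = np.random.default_rng(0)
n_genes, dim = 6, 4  # toy sizes; real data has thousands of genes
q, k, v = (rng.normal(size=(n_genes, dim)) for _ in range(3))

# Hypothetical interaction graph: each gene attends to itself and one neighbour.
self_loops = [(i, i) for i in range(n_genes)]
ring = [(i, (i + 1) % n_genes) for i in range(n_genes)]
edges = np.array(self_loops + ring)

# The complete graph recovers ordinary dense attention.
full_edges = np.array([(i, j) for i in range(n_genes) for j in range(n_genes)])

sparse_out = graph_attention(q, k, v, edges)
full_out = graph_attention(q, k, v, full_edges)
dense_out = dense_attention(q, k, v)
```

In this toy setting the sparse variant evaluates 12 scores instead of 36. With the complete edge list, `graph_attention` reproduces `dense_attention`, which shows that the restriction only removes pairs and does not change how the remaining pairs are weighted. Whether such a restriction preserves predictive performance on real sequencing data is exactly the open question the project describes.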