Project

Search schemes for sequence alignment

Code
3S005121
Duration
01 November 2021 → 31 October 2025
Funding
Research Foundation - Flanders (FWO)
Promotor
Research disciplines
  • Natural sciences
    • Parallel programming
    • Development of bioinformatics software, tools and databases
  • Engineering and technology
    • Bio-informatics
Keywords
Sequence alignment Approximate string matching Graph alignment
 
Project description

Search schemes and a bidirectional index provide a new algorithmic framework for lossless approximate matching, where all approximate matches of a pattern P in a larger search text T are found. Nearly all bioinformatics sequence alignment tools use lossy approximate matching, as lossless approximate matching was historically slower. Search schemes promise to decrease this performance gap and could even be faster. Additionally, our research group has already realized a software prototype. This prototype confirms the increase in performance. We propose to develop algorithms for faster lossless approximate pattern matching based on search schemes and bidirectional full-text indices by taking into account a) the repeat structure of the search text; b) the specific properties of the search pattern. We propose to apply these algorithms to 1) sequence alignments to a linear genome; 2) sequence-to-graph-alignments; 3) alignment of long erroneous reads.