-
Natural sciences
- Parallel programming
- Development of bioinformatics software, tools and databases
-
Engineering and technology
- Bio-informatics
Search schemes and a bidirectional index provide a new algorithmic framework for lossless approximate matching, where all approximate matches of a pattern P in a larger search text T are found. Nearly all bioinformatics sequence alignment tools use lossy approximate matching, as lossless approximate matching was historically slower. Search schemes promise to decrease this performance gap and could even be faster. Additionally, our research group has already realized a software prototype. This prototype confirms the increase in performance. We propose to develop algorithms for faster lossless approximate pattern matching based on search schemes and bidirectional full-text indices by taking into account a) the repeat structure of the search text; b) the specific properties of the search pattern. We propose to apply these algorithms to 1) sequence alignments to a linear genome; 2) sequence-to-graph-alignments; 3) alignment of long erroneous reads.