We are a computational team that researches algorithms for big biological data, such as next-generation sequencing data. Our background is in computer science and programming, yet the primary goal of the group is to perform computational research that is directly applicable to bioinformatics and biology. We are mainly interested in genomics and metagenomics, and to a lesser extent in pan-genomics, transcriptomics and proteomics. Concretely, the group designs algorithms and data structures, implements them in software tools, and collaborates with biology groups.
Some examples of recent projects are the development of data structures to index large collections of sequencing datasets, methods for improving bacterial genome assemblies with long reads, and more general work on genome assembly and k-mers.
Our ongoing projects include research on k-mers, de Bruijn graphs, genome assembly using short reads and PacBio HiFi data, and the development of a search engine for all sequence data.