We are a new computational team that researches algorithms for big biological data, such as next-generation sequencing data. Our background is in computer science and programming, but the group's primary goal is computational research that is directly applicable to bioinformatics and biology. We are mainly interested in genomics and metagenomics, and secondarily in pan-genomics, transcriptomics and proteomics. Concretely, the group designs algorithms and data structures, implements them in software tools, and collaborates with biology groups.
Recent projects include data structures for indexing large collections of sequencing datasets, methods for improving bacterial genome assemblies with long reads, and genome assemblies (giraffe, gorilla Y chromosome, mountain goat).
Our ongoing projects include the analysis of variants in Alzheimer’s disease whole-genome sequencing data, the development of algorithms for linked-read sequencing data, and a search engine for all human RNA-seq experiments sequenced to date.