We are a new computational team that researches algorithms for big biological data, such as next-generation sequencing data. Our research is rooted in computer science, but the group's primary goal is to apply its research products to bioinformatics and biology. Our biological interests include genomics, metagenomics, pan-genomics, transcriptomics and proteomics. The group develops algorithms and data structures, implements them in software tools, and collaborates with biology groups.
Recent projects include data structures for indexing large collections of sequencing datasets, methods for improving bacterial genome assemblies with long reads, and genome assemblies (giraffe, gorilla Y chromosome, mountain goat).
Our ongoing projects include the analysis of variants in Alzheimer’s disease whole-genome sequencing data, the development of algorithms for linked-read sequencing data, and a search engine covering all previously sequenced human RNA-seq experiments.