Repeatoire is a bioinformatics tool for locating DNA repeats in sequenced (but not necessarily assembled) genomes. Specifically, our focus is on repeat families, i.e. related sets of repeats that share a similar role and origin. A non-trivial question is what exactly constitutes a DNA repeat. On the surface, a repeat is a subsequence of a given genome that resembles another subsequence in the same genome. In practice, one is usually interested in repeats that would be unexpected in a random assembly of the genetic text. The biology behind these repeats, however, makes them considerably more exciting and complex. Biologically, repeats are a source of functional overlap and sequence recombination. Repeats may include entire genes or operons, in which case functional redundancy arises from the overlap in function between the two copies. Recombination also depends heavily on repeats, as most recombination processes require some level of similarity between the sequences involved. Our intention with Repeatoire is therefore to identify these biologically relevant interspersed repeats, which may be drivers of recombination, sources of functional overlap, or markers of other exciting phenomena.
Authors: Todd Treangen, Aaron Darling, Mark A. Ragan, Xavier Messeguer, Eduardo PC Rocha.