Link to PubMed [PMID] – 9611243
Nucleic Acids Res. 1998 Jun;26(12):2971-80
We present a general analysis of oligonucleotide usage in the complete genome of Bacillus subtilis. Several datasets were built in order to assign various biological contexts to the biased use of words and to reveal local asymmetries in word usage that may be coupled with replication, the control of gene expression and the restriction/modification system. This analysis was complemented by cross-comparisons with the complete genomes of Escherichia coli, Haemophilus influenzae and Methanococcus jannaschii. We have observed a large number of biased oligonucleotides for words of size up to 8, throughout the datasets and species, indicating that such long strict words play an important role as biological signals. We speculate that some of them are involved in interactions with DNA and/or RNA polymerases. An extensive analysis of palindrome abundances and distributions provides the surprising result that prophage-like elements embedded in the genome exhibit a weaker avoidance of restriction sites. This may reinforce a recently proposed hypothesis of a selfish gene phenomenon in the transfer of restriction/modification systems in bacteria.
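To make the notions of word usage bias and palindrome avoidance concrete, the following is a minimal illustrative sketch and not the authors' method. It counts overlapping words of size k, tests whether a word equals its own reverse complement (the sense in which restriction sites are palindromes), and computes an observed/expected ratio. The function names are hypothetical, and the expected count assumes independent base frequencies, a simplification chosen here for brevity; the abstract does not state which background model the paper uses.

```python
from collections import Counter

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_words(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def is_palindrome(word):
    """A DNA palindrome reads the same on both strands, e.g. GAATTC."""
    return word == reverse_complement(word)


def observed_expected_ratio(seq, word):
    """Ratio of the observed count of word to its expected count.

    Assumption (not from the source): the expected count treats bases as
    independent, using the single-base frequencies of seq itself.
    """
    n, k = len(seq), len(word)
    base_counts = Counter(seq)
    expected = n - k + 1  # number of word positions in seq
    for base in word:
        expected *= base_counts[base] / n
    observed = count_words(seq, k)[word]
    return observed / expected if expected else float("nan")
```

Under this simplified model, a ratio well below 1 for a palindromic word would suggest avoidance, and a ratio well above 1 would suggest over-representation. Comparing such ratios between two regions of a genome, for instance a prophage-like element and the rest of the chromosome, is one way to picture the kind of contrast the abstract describes.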