Link to PubMed [PMID] – 15461426
J. Mol. Evol. 2004 Jun;58(6):692-700
CpG dinucleotide deficiency has been found in viruses, mitochondria, prokaryotes, and eukaryotes. The consensus explanation is that it is due to deamination of methylated cytosines, as established for vertebrates and plants. However, we still do not know whether C5 cytosine methylation is also the major cause of CpG deficiency in bacteria. By combining annotation and experimental data identifying the presence of C5 cytosine methyltransferases with analysis of CpG relative abundance in 67 bacterial species, we found that CpG relative abundance in most bacterial genomes that have cytosine C5 methyltransferases tends to be in the normal range (observed/expected values between 0.82 and 1.21). In contrast, many bacterial species likely to lack C5 cytosine methylation showed CpG deficiency. Furthermore, when comparing genomes with one another, TpG and CpA relative abundances were found to be independent of CpG relative abundance. This contrasted with intragenome analyses, where C₃pG₁ relative abundance (the subscripts refer to the position of the nucleotide in a codon) was found to be generally positively correlated with T₃pG₁ relative abundance when plotted against GC content in protein-coding sequences (CDSs). This suggests the existence of alternative mechanisms contributing to CpG deficiency in bacteria.
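The observed/expected values mentioned above can be illustrated with a minimal sketch. It assumes the common single-strand definition of dinucleotide relative abundance, f(XY) / (f(X) · f(Y)); the abstract does not state the exact formula the authors used, so this may differ from their method (for example, it does not symmetrize over both strands). The function names `relative_abundance` and `classify` are hypothetical; only the 0.82 and 1.21 bounds come from the abstract.

```python
from collections import Counter


def relative_abundance(seq: str, dinuc: str = "CG") -> float:
    """Observed/expected ratio f_XY / (f_X * f_Y) for one dinucleotide.

    Assumption: single-strand frequencies, overlapping dinucleotide windows.
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    x, y = dinuc.upper()

    base_counts = Counter(seq)
    # Count every overlapping occurrence of the dinucleotide.
    observed = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == x + y)

    f_xy = observed / (len(seq) - 1)   # dinucleotide frequency
    f_x = base_counts[x] / len(seq)    # frequency of the first base
    f_y = base_counts[y] / len(seq)    # frequency of the second base
    if f_x == 0 or f_y == 0:
        raise ValueError("expected frequency is zero; ratio is undefined")
    return f_xy / (f_x * f_y)


def classify(ratio: float, low: float = 0.82, high: float = 1.21) -> str:
    """Label a ratio using the 'normal range' bounds quoted in the abstract."""
    if ratio < low:
        return "deficient"
    if ratio > high:
        return "excess"
    return "normal"
```

For instance, `classify(relative_abundance(genome_sequence))` would label a genome as CpG deficient when its ratio falls below 0.82, where `genome_sequence` stands for any nucleotide string supplied by the reader. Passing `"TG"` or `"CA"` as `dinuc` gives the TpG and CpA relative abundances under the same assumption.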