Link to Pubmed [PMID] – 10446248
Nucleic Acids Res. 1999 Sep;27(17):3567-76
We analysed the termini of Bacillus subtilis protein-coding sequences and compared them with those of other genomes. The analysis focused on signals, compositional biases of nucleotides, oligonucleotides, codons and amino acids, and mRNA secondary structure. AUG is the preferred start codon in all genomes, independent of their G+C content, and seems to induce less stable mRNA structures. However, it is neither conserved between homologous genes nor preferred in highly expressed genes. In B.subtilis the ribosome binding site is very strong. We found that downstream boxes do not seem to exist in either Escherichia coli or B.subtilis. UAA stop codon usage is correlated with G+C content and is strongly selected in highly expressed genes. We found less stable mRNA structures at both termini, which we related to mRNA-ribosome and mRNA-release-factor interactions. This pattern seems to impose a peculiar A-rich nucleotide and codon usage bias in these regions. Finally, the analysis of all B.subtilis proteins revealed a similar amino acid bias near both termini, consisting of an over-representation of hydrophilic residues. This bias near the stop codon is partially release-factor specific.
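As a rough illustration of the kind of tally behind the start and stop codon observations, the following minimal Python sketch counts the first and last codon of each coding sequence and computes G+C content. It is not the authors' method: the function names (`gc_content`, `terminal_codon_usage`) and the toy sequences are hypothetical, and a real analysis would read annotated coding sequences from a genome file.

```python
from collections import Counter


def gc_content(seq):
    """Return the fraction of G and C nucleotides in a DNA or RNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def terminal_codon_usage(cds_list):
    """Tally the start codon and stop codon of each coding sequence.

    Sequences are given as DNA (coding strand) and reported as RNA codons.
    Sequences shorter than two codons, or whose length is not a multiple
    of three, are skipped.
    """
    starts = Counter()
    stops = Counter()
    for cds in cds_list:
        rna = cds.upper().replace("T", "U")
        if len(rna) < 6 or len(rna) % 3 != 0:
            continue
        starts[rna[:3]] += 1   # first codon of the gene
        stops[rna[-3:]] += 1   # last codon of the gene
    return starts, stops


# Hypothetical toy sequences, for illustration only (not real B. subtilis genes).
toy_cds = ["ATGAAAGCTTAA", "GTGAAACCGTGA", "ATGGCTAAATAA"]
starts, stops = terminal_codon_usage(toy_cds)
print("start codons:", dict(starts))
print("stop codons:", dict(stops))
print("mean G+C:", sum(gc_content(s) for s in toy_cds) / len(toy_cds))
```

Applied per genome, such counts could be set against genomic G+C content, or restricted to a subset of highly expressed genes, to examine the trends described above.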