Link to Pubmed [PMID] – 9390284
Pac Symp Biocomput. 1997:109-21.
The goal of the inverse folding problem is to supply a list of sequences compatible with a known protein structure. If two-body interactions are taken into account in energy calculations, an exhaustive exploration of the energy landscape in sequence space cannot be achieved because of the huge number of possible combinations. To circumvent this problem, we propose a method in which multiple copies corresponding to every possible side-chain type are attached to each Cα position in the protein. The weight of each copy (stored in the sequence matrix SM) is refined using mean field theory: each side-chain copy interacts with the mean field generated by all possible side-chain copies at neighbouring positions, weighted by their respective probabilities. The potential energy is simply taken to be amino acid pair potentials of mean force. The method converges in a few cycles to a self-consistent solution. The refined matrix does not depend on the starting point; therefore the method succeeds in removing memory effects. Starting solely from the backbone of the known structure, and without information from the initial sequence, the final sequence matrix SM is shown to be able to retrieve significant sequence information, as observed through a series of structure-recognizes-sequence(s) computer experiments. The issue of specificity is discussed in detail.
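The self-consistent refinement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The Boltzmann-style update rule, the temperature kT, the damping factor, the convergence tolerance, the two-letter alphabet, the four-position chain and the energy values are all assumptions chosen for illustration, and the distance dependence of a real potential of mean force is ignored. The function name mean_field_refine and the constants NEIGHBOURS and PAIR_ENERGY are hypothetical.

```python
import math


def mean_field_refine(neighbours, pair_energy, n_types, kT=1.0, damping=0.5,
                      max_cycles=200, tol=1e-8, start=None):
    """Iterate a sequence matrix sm[i][a] (weight of type a at position i)
    to a self-consistent solution. Returns (sm, cycles_used)."""
    n_pos = len(neighbours)
    if start is not None:
        sm = [row[:] for row in start]
    else:
        # Uniform starting weights: no information from any initial sequence.
        sm = [[1.0 / n_types] * n_types for _ in range(n_pos)]

    cycle = 0
    for cycle in range(1, max_cycles + 1):
        new_sm = []
        for i in range(n_pos):
            # Mean-field energy of type a at position i: pair energies with
            # every type b at each neighbouring position j, weighted by sm[j][b].
            energies = [
                sum(sm[j][b] * pair_energy[a][b]
                    for j in neighbours[i] for b in range(n_types))
                for a in range(n_types)
            ]
            # Assumed update: Boltzmann weights, shifted by the minimum energy
            # for numerical stability, then normalised over the types.
            e_min = min(energies)
            boltz = [math.exp(-(e - e_min) / kT) for e in energies]
            z = sum(boltz)
            # Damped mixing of old and new weights (an assumption of this sketch).
            new_sm.append([(1.0 - damping) * sm[i][a] + damping * boltz[a] / z
                           for a in range(n_types)])
        change = max(abs(new_sm[i][a] - sm[i][a])
                     for i in range(n_pos) for a in range(n_types))
        sm = new_sm
        if change < tol:
            break
    return sm, cycle


# Toy system (hypothetical): a 4-position chain, each position in contact
# with its chain neighbours, and a two-letter alphabet (0 = H, 1 = P) in
# which only H-H contacts are favourable.
NEIGHBOURS = [[1], [0, 2], [1, 3], [2]]
PAIR_ENERGY = [[-1.0, 0.0],
               [0.0, 0.0]]

if __name__ == "__main__":
    sm, cycles = mean_field_refine(NEIGHBOURS, PAIR_ENERGY, n_types=2)
    print("cycles used:", cycles)
    for i, row in enumerate(sm):
        print(i, [round(w, 4) for w in row])
```

In this toy system, running the refinement from a uniform matrix and from a strongly biased one gives the same refined matrix, which mirrors the independence from the starting point reported in the abstract. That behaviour is a property of the small example and of the damped update used here; the sketch does not establish it for real pair potentials.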