PubMed PMID: 24766276
J. Comput. Biol. 2014 Jul;21(7):534-47
Genome-scale metabolic model reconstruction is a complicated process beginning with (semi-)automatic inference of the reactions participating in the organism’s metabolism, followed by many iterations of network analysis and improvement. Despite advances in automatic model inference and analysis tools, reconstruction may still miss some reactions or add erroneous ones. Consequently, a human expert’s analysis of the model will continue to play an important role in all the iterations of the reconstruction process. This analysis is hampered by the size of the genome-scale models (typically thousands of reactions), which makes it hard for a human to understand them. To aid human experts in curating and analyzing metabolic models, we have developed a method for knowledge-based generalization that provides a higher-level view of a metabolic model, masking its inessential details while presenting its essential structure. The method groups biochemical species in the model into semantically equivalent classes based on the ChEBI ontology, identifies reactions that become equivalent with respect to the generalized species, and factors those reactions into generalized reactions. Generalization allows curators to quickly identify divergences from the expected structure of the model, such as alternative paths or missing reactions, that are the priority targets for further curation. We have applied our method to genome-scale yeast metabolic models and shown that it improves understanding by helping to identify both specificities and potential errors.
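The grouping-and-factoring idea described above can be illustrated with a minimal sketch. This is not the authors' implementation. The `SPECIES_CLASS` mapping is a hypothetical, hand-written stand-in for a lookup of ancestor terms in the ChEBI ontology, and the reaction identifiers and toy reactions are invented for illustration. Stoichiometry and compartments are ignored.

```python
from collections import defaultdict

# Hypothetical species -> class mapping. In the described method this
# information would be derived from the ChEBI ontology; here it is hard-coded.
SPECIES_CLASS = {
    "palmitoyl-CoA": "acyl-CoA",
    "stearoyl-CoA": "acyl-CoA",
    "palmitate": "fatty acid",
    "stearate": "fatty acid",
}


def generalize_species(species):
    """Return the class of a species, or the species itself if ungrouped."""
    return SPECIES_CLASS.get(species, species)


def generalize_reactions(reactions):
    """Group reactions that become identical once species are generalized.

    `reactions` maps a reaction id to a (reactants, products) pair of sets.
    The result maps each generalized (reactants, products) signature to the
    list of original reaction ids that share it.
    """
    groups = defaultdict(list)
    for rid, (reactants, products) in reactions.items():
        key = (
            frozenset(generalize_species(s) for s in reactants),
            frozenset(generalize_species(s) for s in products),
        )
        groups[key].append(rid)
    return dict(groups)


# Toy model (assumed for illustration): two fatty acid activation reactions
# and one unrelated reaction.
REACTIONS = {
    "r1": ({"palmitate", "CoA", "ATP"}, {"palmitoyl-CoA", "AMP", "PPi"}),
    "r2": ({"stearate", "CoA", "ATP"}, {"stearoyl-CoA", "AMP", "PPi"}),
    "r3": ({"glucose", "ATP"}, {"glucose 6-phosphate", "ADP"}),
}

GENERALIZED = generalize_reactions(REACTIONS)

for (reactants, products), ids in GENERALIZED.items():
    print(sorted(reactants), "->", sorted(products), ":", ids)
```

In this toy model, `r1` and `r2` collapse into one generalized reaction (fatty acid + CoA + ATP -> acyl-CoA + AMP + PPi), while `r3` stays on its own. A reaction that a curator expected to fall into a group but that appears separately would be the kind of divergence the abstract describes as a curation target.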