Link to PubMed [PMID] – 30502941
Meth. Enzymol. 2018;612:183-195
Microbial species thrive in very diverse environments and play fundamental roles in their equilibrium and dynamics. Metagenomics consists of extracting, sequencing, and studying the DNA present in ecosystems to better understand their regulation. Ideally, the maximal amount of information would be gathered from the full sequences of the genomes, episomes, and phages present in the microbial communities. Current high-throughput DNA sequencing produces reads ranging in size from a few dozen base pairs for the most commonly used technologies to several kb for emerging single-molecule real-time sequencing techniques. Although valuable information can be extracted by assembling these reads into contigs, reconstructing full genomes remains a difficult task. Clustering contigs according to their similarities or the covariation of their read coverage gives some insight into these genomes, but remains limited because viral sequences and recently horizontally transferred genes often differ from their host genomes. We recently developed meta3C, a proximity ligation approach that bins contigs in a sequence-independent way by quantifying and exploiting the frequencies of their three-dimensional collisions in vivo. This technique has demonstrated great potential to reconstruct genomes as well as to assign plasmids and phages to their hosts. It nevertheless requires specific processing of the microbial samples before sequencing, which has to be carefully planned.