We present ChIPflow, a Snakemake-based ChIP-seq pipeline that follows ENCODE guidelines for the identification and quantification of reproducible peaks from raw sequencing data. It also includes a statistical module to perform tailored differential marking and binding analysis with state-of-the-art methods. ChIPflow streamlines critical steps such as the quality assessment of the immunoprecipitation using cross-correlation and the selection of reproducible peaks between replicates, for both narrow and broad peaks. It generates complete reports for data quality control assessment and optimal interpretation of the results.
We advocate for a differential analysis that accounts for the biological dynamics of each chromatin factor. Thus, ChIPflow provides linear and nonlinear methods for normalisation as well as conservative and stringent models for variance estimation and significance testing of the observed binding/marking differences.
Using a published ChIP-seq dataset, we show that distinct populations of differentially marked/bound peaks can be identified. We study their dynamics in terms of read coverage and summit position, as well as the expression of the neighbouring genes. We propose that ChIPflow can be used to measure the richness of the epigenomic landscape underlying a biological process by identifying diverse regulatory regimes.
The workflow is stored in a dedicated Git repository.
Keywords: ChIP-seq, workflow, Snakemake, differential epigenomics, quantitative regulation