Workflows can get really complex If you turn your entire analysis project into a Snakemake workflow, you may end up with something like this. Sometimes, the standard Snakemake wildcard matching logic is not enough to express the connections you need to make, especially with complex combining steps.…
I want this post to describe a bug that I got recently that led me to use constraints on wildcards. The error was not intuitive and can be very disturbing at first if one is not used to the Snakemake logic. Below I will describe step by step how to build a minimal working example […]
Introduction Subdividing your pipeline into subworkflows is a powerful mean to achieve complex analysis. It also enables to avoid code repetition. Typically, if one has toanalyse different types of data such as ChIP-seq and RNA-seq, and then perform multi-omics data integration, it becomes handy to…
Introduction In this post, I want to test r2u, a very rapid and efficient tool to install R packages. It currently supports 19,066 and 18,921 binary packages from CRAN in “focal” and “jammy” respectively. It also supports 207 (focal) and 215 (jammy) BioConductor packages…
In the previous posts, we saw how to get started with snakemake, reduce command-line options, submit your jobs to a cluster, define resources and threads and handling memory and timeout errors. In this last post about snakemake profiles, I will show how to use singularity containers. If…
In the previous posts, we saw how to get started with snakemake, reduce command-line options, submit your jobs to a cluster and define resources and threads. However if one of your jobs fails because it uses more memory or time than requested, with what was covered so far, snakemake will…
In the previous posts, we saw how to get started with snakemake, reduce command-line options, and submit your jobs to a cluster. I ended the previous post by mentioning resources that were set by default. In this post, we will see how to adapt the amount of RAM, time, and the number of threads to…
The power of snakemake is to enable parallelization. Through the use of a cluster, jobs can be processed in parallel automatically. One needs to define how snakemake will handle the job submission process. If you already followed the two first posts (1,2), you can skip the first section.…
A profile is a folder that contains all the configuration parameters to successfully run your pipeline. Of note, if you have used a cluster.json file before, be aware that it has been deprecated. Preparation of files (if you skipped the first post) Run the following script to create the folder…
This blog post is the first of a series on creating snakemake profiles. Some content was directly copied from the snakemake manual. It is supposed that the reader has some basic concepts of snakemake, even if I start from the very beginning. Introduction Adapting Snakemake to a particular…