{"id":748,"date":"2022-05-23T20:01:17","date_gmt":"2022-05-23T20:01:17","guid":{"rendered":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/?p=748"},"modified":"2022-05-23T20:01:17","modified_gmt":"2022-05-23T20:01:17","slug":"snakemake-profile-6-using-singularity-containers","status":"publish","type":"post","link":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/blog\/2022\/05\/snakemake-profile-6-using-singularity-containers\/","title":{"rendered":"Snakemake profile 6: Using singularity containers"},"content":{"rendered":"\n<div style=\"height:29px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>In the previous posts, we saw how to&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/blog\/2022\/03\/snakemake-profile-1-getting-started-with-snakemake\/\" target=\"_blank\">get started with snakemake<\/a>,&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/blog\/2022\/03\/snakemake-profile-2-reducing-command-line-options-with-profile\/\" target=\"_blank\">reduce command-line options<\/a>, <a rel=\"noreferrer noopener\" href=\"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/blog\/2022\/04\/snakemake-profile-3-cluster-submission-defining-parameters\/\" target=\"_blank\">submit your jobs to a cluster<\/a>, <a rel=\"noreferrer noopener\" href=\"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/blog\/2022\/04\/snakemake-profile-4-defining-resources-and-threads\/\" target=\"_blank\">define resources and threads<\/a>, and <a rel=\"noreferrer noopener\" href=\"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/blog\/2022\/05\/snakemake-profile-5-handling-memory-and-timeout-errors\/\" target=\"_blank\">handle memory and timeout errors<\/a>. <strong>In this last post<\/strong> about snakemake profiles, I will show how to use singularity containers. 
If you followed the previous posts, deactivate your environment to continue:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/bash\n\nconda deactivate\ncd ..\n<\/code><\/pre>\n\n\n\n<div style=\"height:29px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Create a new project<\/h2>\n\n\n\n<div style=\"height:29px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Explaining how to create and use <a rel=\"noreferrer noopener\" href=\"https:\/\/apptainer.org\/user-docs\/master\/quick_start.html\" target=\"_blank\">Singularity containers<\/a> is outside the scope of this tutorial. Be aware that Singularity joined the Linux Foundation and was renamed <a rel=\"noreferrer noopener\" href=\"https:\/\/apptainer.org\/getting-started\" target=\"_blank\">Apptainer<\/a>. Therefore, this part might quickly become out of date.<\/p>\n\n\n\n<p>In this section, we will use a Singularity container with a &#8220;toy example&#8221; rule like those of the previous posts. I will assume that you are able to build the container image. 
Start by creating a project folder:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/bash\n\n# Create the folder containing the files needed for this tutorial\nmkdir snakemake-profile-singularity\n\n# Enter the created folder\ncd snakemake-profile-singularity\n\n# Create an empty file that will contain the snakemake code\ntouch snakeFile\n\n# Create toy input files\nmkdir inputs\necho \"toto\" &gt; inputs\/hello.txt\necho \"totoBis\" &gt; inputs\/helloBis.txt\n\n# Create a folder and an empty file for the conda environment\n# This is done to make sure that you use the same snakemake version as I do\nmkdir envs\ntouch envs\/environment.yaml\n\n# Create a folder and an empty file for the profile\nmkdir profile\ntouch profile\/config.yaml\n\n# Create a folder that will hold the singularity\nmkdir singularities\n<\/code><\/pre>\n\n\n\n<p>Use the following recipe to build the <code>fastqcv0119.sif<\/code> singularity and copy it to the <code>singularities<\/code> folder:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Bootstrap: docker\nFrom: biocontainers\/fastqc:v0.11.9_cv7\n\n%runscript\n    echo \"Running container biocontainers\/fastqc:v0.11.9_cv7, FastQC v0.11.9\"\n    exec \/bin\/bash \"$@\"\n<\/code><\/pre>\n\n\n\n<p>Copy the following content to <code>envs\/environment.yaml<\/code> (indent with two spaces):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>channels:\n  - bioconda\ndependencies:\n  - snakemake-minimal=6.15.1\n<\/code><\/pre>\n\n\n\n<p>Then execute the following commands to create and use a conda environment containing snakemake v6.15.1:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/bash\n\nconda env create -p envs\/smake --file envs\/environment.yaml\nconda activate envs\/smake\n<\/code><\/pre>\n\n\n\n<div style=\"height:29px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Using singularities<\/h2>\n\n\n\n<div style=\"height:29px\" aria-hidden=\"true\" 
class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Copy the following content to <code>snakeFile<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>onstart:\n    print(\"##### TEST #####\\n\") \n    print(\"\\t Creating jobs output subfolders...\\n\")\n    shell(\"mkdir -p jobs\/helloSingularity\")\n\nFILESNAMES=&#91;\"hello\", \"helloBis\"]\n\nrule all:\n    input:\n        expand(\"results\/{recipient}-sing.txt\", recipient=FILESNAMES)\n\nrule helloSingularity:\n    input:\n        \"inputs\/{recipient}.txt\"\n    output:\n        \"results\/{recipient}-sing.txt\"\n    threads: 1\n    singularity: \"singularities\/fastqcv0119.sif\"\n    shell:\n        \"\"\"\n        cat {input} &gt; {output}\n        \"\"\"        \n<\/code><\/pre>\n\n\n\n<p>We added a new <code>singularity<\/code> directive to the rule; it takes the absolute or relative path to <code>fastqcv0119.sif<\/code>. In <code>profile\/config.yaml<\/code>, add the options <code>use-singularity: True<\/code> and <code>singularity-args: \"--bind mypath\/snakemake-profile-singularity\"<\/code> (<strong>do not forget to modify <code>mypath<\/code><\/strong>). The <code>--bind<\/code> option gives the container access to the input files (<code>inputs\/hello.txt<\/code> and <code>inputs\/helloBis.txt<\/code>).<br>The bound folder should always be at or above your files in the folder hierarchy. 
Copy the following content to <code>profile\/config.yaml<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>---\n\nsnakefile: snakeFile\n\nlatency-wait: 60\nreason: True\nshow-failed-logs: True\nkeep-going: True\nprintshellcmds: True\n\n# Cluster submission\njobname: \"{rule}.{jobid}\"              # Provide a custom name for the jobscript that is submitted to the cluster.\nmax-jobs-per-second: 1                 # Maximal number of cluster\/drmaa jobs per second, default is 10, fractions allowed.\nmax-status-checks-per-second: 10       # Maximal number of job status checks per second, default is 10\njobs: 400                              # Use at most N CPU cluster\/cloud jobs in parallel.\ncluster: \"sbatch --output=\\\"jobs\/{rule}\/slurm_%x_%j.out\\\" --error=\\\"jobs\/{rule}\/slurm_%x_%j.log\\\" --mem={resources.mem_mb} --time={resources.runtime} --parsable\"\ncluster-status: \".\/profile\/status-sacct.sh\" # Used to handle timeout exceptions, do not forget to chmod +x\n\n# singularity\nuse-singularity: True\nsingularity-args: \"--bind mypath\/snakemake-profile-singularity\"\n\n# Job resources\nset-resources:\n  - helloSingularity:mem_mb=1000\n  - helloSingularity:runtime=00:03:00\n    \n# For some reason, the time needs quotes to be read by snakemake\ndefault-resources:\n  - mem_mb=500\n  - runtime=\"00:01:00\"\n  \n# Define the number of threads used by rules\nset-threads:\n  - helloSingularity=1\n<\/code><\/pre>\n\n\n\n<p>Create a <code>profile\/status-sacct.sh<\/code> (see the previous posts for details) with the following content:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/env bash\n\n# Check status of Slurm job\n\njobid=\"$1\"\n\nif &#91;&#91; \"$jobid\" == Submitted ]]\nthen\n  echo smk-simple-slurm: Invalid job ID: \"$jobid\" &gt;&amp;2\n  echo smk-simple-slurm: Did you remember to add the flag --parsable to your sbatch call? 
&gt;&amp;2\n  exit 1\nfi\n\noutput=`sacct -j \"$jobid\" --format State --noheader | head -n 1 | awk '{print $1}'`\n\nif &#91;&#91; $output =~ ^(COMPLETED).* ]]\nthen\n  echo success\nelif &#91;&#91; $output =~ ^(RUNNING|PENDING|COMPLETING|CONFIGURING|SUSPENDED).* ]]\nthen\n  echo running\nelse\n  echo failed\nfi\n<\/code><\/pre>\n\n\n\n<p>Make <code>profile\/status-sacct.sh<\/code> executable and perform a run:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/bash\n\nchmod +x profile\/status-sacct.sh\nsnakemake --profile profile\/\n<\/code><\/pre>\n\n\n\n<p>If you get the error message <code>FATAL: container creation failed: unable to add<\/code> in <code>jobs\/helloSingularity\/*log<\/code>, the path in the <code>singularity-args<\/code> option of <code>profile\/config.yaml<\/code> is incorrect. Otherwise, you should see the following line in the log files:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Activating singularity image singularities\/fastqcv0119.sif<\/code><\/pre>\n\n\n\n<p>This line confirms that the code of your rule ran inside your singularity container.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous posts, we saw how to&nbsp;get started with snakemake,&nbsp;reduce command-line options, submit your jobs to a cluster, define resources and threads, and handle memory and timeout errors. In this last post about snakemake profiles, I will show how to use singularity containers. 
If&hellip;<\/p>\n","protected":false},"author":5,"featured_media":764,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[4096],"tags":[4100,4098],"embl_taxonomy":[],"class_list":["post-748","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technical","tag-profile","tag-snakemake"],"acf":[],"embl_taxonomy_terms":[],"featured_image_src":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-content\/uploads\/2022\/05\/banksy.jpg","_links":{"self":[{"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/posts\/748"}],"collection":[{"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/comments?post=748"}],"version-history":[{"count":9,"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/posts\/748\/revisions"}],"predecessor-version":[{"id":768,"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/posts\/748\/revisions\/768"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/media\/764"}],"wp:attachment":[{"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/media?parent=748"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/categories?post=748"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/tags?post=748"},{"taxonomy":"embl_taxonomy","embeddable":tru
e,"href":"https:\/\/www.embl.org\/groups\/bioinformatics-rome\/wp-json\/wp\/v2\/embl_taxonomy?post=748"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}