Installation
Viralgenie uses Nextflow and a package/container management system (Docker, Singularity or Conda), so both need to be installed on the system where you launch your analysis.
New to bioinformatics?
If the word "terminal" brings to mind an airport boarding area, you may feel a little lost. This blog post (up until "Configuring an Xserver ...") will help people with little bioinformatics experience set up Nextflow and Docker on a Windows computer.
Software managers: Docker, Singularity, and Conda
Viralgenie can be run using Docker, Singularity or Conda. The choice is up to the user, but note that Docker and Singularity are the most reproducible. Nextflow also supports other container systems, such as Podman, Shifter, and Charliecloud. You can read the full list of supported containers and how to set them up here.
When using these containers, Nextflow will use the manager for each process that is executed. In other words, Nextflow will be using docker run or singularity exec without the need for you to do anything else.
Docker is a containerisation system that allows you to package your code, tools and data into a single image that can be run on most operating systems. It is the most widely used containerisation system in bioinformatics.
To install Docker, follow the instructions on the Docker website.
Warning
Docker requires root access to run. If you do not have root access, for example as a user on an HPC system or in the cloud, use Singularity instead.
Singularity is a containerisation system that allows you to package your code, tools and data into a single image that can be run on most operating systems. Unlike Docker, it does not require root access to run containers, which makes it a common choice on HPC systems.
To install Singularity, follow the instructions on the Singularity website.
Warning
Singularity is a great alternative to Docker, but it can be challenging to set up on an Apple silicon chip or any other ARM device. If you are using an ARM device, consider using Docker instead.
Conda is a package manager that allows you to install software packages and dependencies in isolated environments. It is a good choice if you are facing issues while installing Docker or Singularity.
- To install Conda, follow the instructions on the Conda website.
- To install Mamba, a faster alternative to Conda, follow the instructions on the Mamba miniforge website.
Warning
Conda environments are great! However, conda tools can easily become broken or incompatible due to dependency issues. For this reason, conda is not as reproducible as Docker or Singularity containers. If you encounter issues with Conda, please try running the pipeline with Docker or Singularity first to see if the issue persists. In other words, if you have a container system, use it over conda!
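The preference stated above (containers first, Conda as a fallback) can be sketched as a small shell check. This is only an illustration: the helper name `first_available` is hypothetical, and it merely tests whether each command is on your `PATH`, not whether the tool is configured correctly.

```shell
# Print the first command from the argument list that exists on PATH.
# Prints "none" and returns 1 when no candidate is found.
first_available() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "none"
  return 1
}

# Containers first, conda last, mirroring the recommendation above.
manager=$(first_available docker singularity conda) || true
echo "Suggested software manager: $manager"
```

The suggested name can then be passed to Nextflow as a profile.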
Nextflow
Nextflow runs on most POSIX systems (Linux, macOS, etc.) and requires Java 11 or later. It can be installed in several ways, including using the Nextflow installer or Bioconda.
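To check the Java requirement before installing, a sketch along the following lines may help. The helper name `java_major` is hypothetical; it assumes version strings of the form `17.0.2` or the legacy `1.8.0_292`, and that `java -version` prints the version in double quotes, as common OpenJDK builds do.

```shell
# Extract the major Java version from a string such as "17.0.2" or "1.8.0_292".
java_major() {
  version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;  # legacy scheme: 1.8.0_292 -> 8.0_292
  esac
  echo "${version%%[!0-9]*}"         # keep only the leading digits
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, e.g.: openjdk version "17.0.2" 2022-01-18
  raw=$(java -version 2>&1 | grep version | head -n 1 | cut -d'"' -f2)
  major=$(java_major "$raw")
  if [ "${major:-0}" -ge 11 ]; then
    echo "Java $raw meets the requirement"
  else
    echo "Java $raw is older than 11; please upgrade"
  fi
else
  echo "java was not found on PATH"
fi
```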
Tip
Unsure how to install Nextflow with these commands? Check out the Nextflow installation documentation for more information.
First, set up Bioconda according to the Bioconda documentation, notably setting up channels:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
A best practice with Conda is to create an environment and install the tools in it. This prevents version conflicts and keeps everything clean. To do so, use the following command:
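A minimal sketch of this step, assuming the Bioconda channels are configured as shown earlier. The environment name `nextflow-env` is a hypothetical choice; any name works.

```shell
ENV_NAME="nextflow-env"   # hypothetical environment name

if command -v conda >/dev/null 2>&1; then
  # Create the environment and install Nextflow into it in one step.
  if conda create --yes --name "$ENV_NAME" nextflow; then
    echo "Environment created; activate it with: conda activate $ENV_NAME"
  else
    echo "conda create failed; check your channel configuration"
  fi
else
  echo "conda was not found on PATH; install it first"
fi
```

After activation, the `nextflow` command is available only inside that environment.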
To deactivate the conda environment, run the following command:
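In an interactive terminal this is simply `conda deactivate`. The sketch below wraps it in a check on `CONDA_DEFAULT_ENV`, a variable conda sets while an environment is active, so that it does nothing when no environment is active. The function name `deactivate_if_active` is hypothetical.

```shell
# Deactivate the current conda environment, if there is one.
deactivate_if_active() {
  if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
    conda deactivate
  else
    echo "No conda environment is active"
  fi
}

deactivate_if_active
```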
If you're already in the conda environment you want to use, you can just install Nextflow directly:
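As a sketch, assuming the Bioconda channels are configured as shown earlier, the install could look like this. The guard around it is only there so the snippet reports a clear message when conda is missing.

```shell
PACKAGE="nextflow"

if command -v conda >/dev/null 2>&1; then
  # Installs into whichever conda environment is currently active.
  conda install --yes "$PACKAGE" || echo "conda install failed"
else
  echo "conda was not found on PATH; install it first"
fi
```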
Viralgenie
If you have both Nextflow and a software manager installed, you are all set! You can test the pipeline using the following command:
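A sketch of such a test run, under a few assumptions: the pipeline is fetched from the `Joon-Klaps/viralgenie` GitHub repository, the `--outdir` parameter follows the nf-core convention, and `results` is a hypothetical output directory. The snippet only prints the command so that you can inspect it before launching.

```shell
PIPELINE="Joon-Klaps/viralgenie"   # assumed GitHub repository of the pipeline
MANAGER="docker"                   # or singularity, conda, ...
OUTDIR="results"                   # hypothetical output directory

# Built as a string so it can be inspected before launching.
CMD="nextflow run $PIPELINE -profile test,$MANAGER --outdir $OUTDIR"
echo "$CMD"
# To launch the run, execute the printed command in your terminal.
```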
Note
With the argument -profile <docker/singularity/.../institute>, you can specify the container system you want to use. The test profile is used to run the pipeline with a small dataset to verify that everything is working correctly.
Running Nextflow on a high-performance computing (HPC) system?
You might not be the first person to run a Nextflow pipeline on your infrastructure! Check out the nf-core configuration website, as it might already contain a specific configuration for your infrastructure.
Apple silicon (ARM)
If you are using an Apple silicon (ARM) machine, you may encounter issues. Most tools are not yet compatible with the ARM architecture, so Conda will most likely fail. In this case, use Docker in combination with the profile arm.
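One way to check whether this applies to your machine is to inspect the architecture reported by `uname -m`. A minimal sketch: the helper name `is_arm` is hypothetical, and `arm64` and `aarch64` are the values commonly reported on Apple silicon and on 64-bit ARM Linux respectively.

```shell
# Return success (0) when the given architecture string denotes 64-bit ARM.
is_arm() {
  case "$1" in
    arm64|aarch64) return 0 ;;
    *) return 1 ;;
  esac
}

if is_arm "$(uname -m)"; then
  echo "ARM detected: combine Docker with the arm profile (-profile docker,arm)"
else
  echo "Not an ARM machine: the arm profile is not needed"
fi
```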