Skip to content

Read classification

Viralgenie offers two main tools for the classification of reads and a summary visualisation tool:

  • Kaiju: Taxonomic classification based on maximum exact matches using protein alignments.
  • Kraken2: Assigns taxonomic labels on a DNA level using a k-mer approach. (optional Bracken)
  • Krona: Interactive multi-layered pie charts of hierarchical data.

metagenomic_diversity

Want more classifiers?

Feel free to reach out and suggest more classifiers. However, if the main goal of your project is to establish the presence of a virus within a sample and are therefore only focused on metagenomic diversity, have a look at taxprofiler

The read classification can be skipped with the argument --skip_read_classification, classifiers should be specified with the parameter --read_classifiers 'kaiju,kraken2' (no spaces, no caps). See the parameters classification section for all relevant arguments to control the classification steps.

Kaiju

Kaiju classifies individual metagenomic reads using a reference database comprising the annotated protein-coding genes of a set of microbial genomes. It employs a search strategy, which finds maximal exact matching substrings between query and database using a modified version of the backwards search algorithm in the Burrows-Wheeler transform. The Burrows-Wheeler transform is a text transformation that converts the reference sequence database into an easily searchable representation, which allows for exact string matching between a query sequence and the database in time proportional to the length of the query.

Kraken2

Kraken2 is a taxonomic sequence classifier that assigns taxonomic labels to DNA sequences. Kraken examines the k-mers within a query sequence and uses the information within those k-mers to query a database. That database maps k-mers to the lowest common ancestor (LCA) of all genomes known to contain a given k-mer.

Bracken can optionally be enabled for more accurate estimation of abundances, although these values should be interpreted with caution as viruses don't have marker genes making it difficult to compare abundances across samples & taxa. --read_classifiers 'kraken2,bracken' (no spaces, no caps)

Krona

Krona allows hierarchical data to be explored with zooming, multi-layered pie charts. The interactive charts are self-contained and can be viewed with any modern web browser.