Skip to content

Read classification

Viralgenie offers two main tools for the classification of reads and a summary visualisation tool:

  • Kaiju: Taxonomic classification based on maximum exact matches using protein alignments.
  • Kraken2: Assigns taxonomic labels on a DNA level using a k-mer approach. (optional Bracken)
  • Krona: Interactive multi-layered pie charts of hierarchial data.

metagenomic_diversity

Want more classifiers?

Feel free to reach out and suggest more classifiers. However, if the main goal of your project is to establish the presence of a virus within a sample and are therefore only focused on metagenomic diversity, have a look at taxprofiler

The read classification can be skipped with the argument --skip_read_classification, classifiers should be specified with the parameter --read_classifiers 'kaiju,kraken2' (no spaces, no caps). See the parameters classification section for all relevant arguments to control the classification steps.

Kaiju

Kaiju classifies individual metagenomic reads using a reference database comprising the annotated protein-coding genes of a set of microbial genomes. It employ a search strategy, which finds maximal exact matching substrings between query and database using a modified version of the backwards search algorithm in the Burrows-Wheeler transform is a text transformation that converts the reference sequence database into an easily searchable representation, which allows for exact string matching between a query sequence and the database in time proportional to the length of the query.

Kraken2

Kraken is a taxonomic sequence classifier that assigns taxonomic labels to DNA sequences. Kraken examines the k-mers within a query sequence and uses the information within those k-mers to query a database. That database maps -mers to the lowest common ancestor (LCA) of all genomes known to contain a given k-mer.

Bracken can optionally be enabled for more accurate estimation of abundancies, although these values should be interpreted with caution as viruses don't have a marker genes making it difficult to compare abundances across samples & taxa. --read_classifiers 'kraken2,bracken' (no spaces, no caps)

Krona

Krona allows hierarchical data to be explored with zooming, multi-layered pie charts. The interactive charts are self-contained and can be viewed with any modern web browser.