Welcome to the Taxonium documentation#

Taxonium is a powerful tool for exploring phylogenetic trees.

Getting started#

How you use Taxonium depends on what you want to do.

Viewing a Newick tree#

If you have a tree in Newick format and only want to view it, go to Taxonium.org and select your tree file.

Adding metadata to your tree#

Optionally, you can upload a metadata file along with your tree. This file should be in TSV or CSV format, with a header row of column names. The left-most column should contain the node names exactly as they appear in the Newick file. The remaining columns should contain the metadata for each node.
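As a rough illustration of the layout described above, the following Python sketch pulls leaf names out of a simple Newick string and writes a TSV whose first column holds those names. The helper names, the `strain` column header, and the sample values are all hypothetical; the regular expression only handles unquoted labels.

```python
import csv
import io
import re
from typing import Dict, List


def leaf_names(newick: str) -> List[str]:
    """Return leaf labels from a simple Newick string (unquoted labels only)."""
    # A leaf label directly follows "(" or "," and ends at ":", ",", ")" or ";".
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def metadata_tsv(rows: Dict[str, Dict[str, str]], columns: List[str]) -> str:
    """Build TSV text: a header row, then one row per node name."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    # "strain" is a hypothetical header for the left-most (node name) column.
    writer.writerow(["strain"] + columns)
    for name, values in rows.items():
        writer.writerow([name] + [values.get(col, "") for col in columns])
    return buf.getvalue()


if __name__ == "__main__":
    names = leaf_names("((A:0.1,B:0.2):0.3,C:0.4);")
    # Made-up metadata, keyed by the names found in the tree.
    rows = {name: {"country": "unknown"} for name in names}
    print(metadata_tsv(rows, ["country"]))
```

Deriving the first column from the tree itself is one way to avoid mismatches between node names and metadata rows.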

Note

Any file supplied to Taxonium may also be gzipped, in which case it should have a .gz extension.
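If a file is not yet compressed, a short Python sketch like the one below can produce the gzipped copy. The function name and the example path are hypothetical; only the standard library is used.

```python
import gzip
import shutil


def gzip_copy(src_path: str) -> str:
    """Write a gzipped copy of src_path next to it and return the new path."""
    dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # Stream the bytes so large files are not loaded into memory at once.
        shutil.copyfileobj(src, dst)
    return dst_path


# Example (hypothetical file name):
# gzip_copy("metadata.tsv")  # creates metadata.tsv.gz
```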

Viewing the global SARS-CoV-2 tree#

We maintain an instance of Taxonium that displays a version of the UShER-built SARS-CoV-2 global tree, at Cov2Tree.org.

Using an UShER mutation annotated tree#

Taxonium is especially powerful when applied to a tree that has been annotated with mutations. With such trees, it lets you search for mutations and display genotypes. These trees are often generated by UShER. Please refer to the UShER documentation for advice on how to make one.

Note

Sometimes you might just want to annotate an existing SARS-CoV-2 tree. You can download mutation-annotated trees (MATs) pre-built by the UShER team from here, and add your own metadata to them using usher_to_taxonium.

Once you have an UShER-annotated tree, we provide a tool for converting it to a format that Taxonium can use. The Taxonium format is a JSONL file with a list of nodes, each with all of its metadata, and a position. To create such a file we can use the usher_to_taxonium tool, from the taxoniumtools package.
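Because the format is JSONL, each line of the file can be parsed independently as JSON. The sketch below reads such a file line by line, with or without gzip compression. It makes no assumption about which fields each line contains, and the function name is hypothetical.

```python
import gzip
import json
from typing import Any, Iterator


def read_jsonl(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a .jsonl or .jsonl.gz file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Example (hypothetical file name): count the lines in a converted file.
# print(sum(1 for _ in read_jsonl("my-tree.jsonl.gz")))
```

Reading lazily like this keeps memory use low, which matters for large trees.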

Installing taxoniumtools#

Taxoniumtools is available from PyPI. You can install it with pip.

pip install taxoniumtools

The usher_to_taxonium utility will then be available for use.
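One way to confirm that the command landed on your PATH after installation is a quick check from Python. This is only a convenience sketch using the standard library; the function name is hypothetical.

```python
import shutil


def tool_available(name: str = "usher_to_taxonium") -> bool:
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if tool_available():
        print("usher_to_taxonium is installed")
    else:
        print("usher_to_taxonium was not found; try: pip install taxoniumtools")
```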

Using usher_to_taxonium from taxoniumtools#

Example#

First get some files:

wget https://github.com/theosanderson/taxonium/raw/master/taxoniumtools/test_data/tfci.meta.tsv.gz
wget https://raw.githubusercontent.com/theosanderson/taxonium/master/taxoniumtools/test_data/hu1.gb
wget https://github.com/theosanderson/taxonium/raw/master/taxoniumtools/test_data/tfci.pb

Then convert from UShER pb format to Taxonium jsonl format:

usher_to_taxonium --input tfci.pb --output tfci-taxonium.jsonl.gz --metadata tfci.meta.tsv.gz --genbank hu1.gb \
--columns genbank_accession,country,date,pangolin_lineage

You can then open the resulting tfci-taxonium.jsonl.gz file at taxonium.org.
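If you prefer to drive the conversion from a script, the same command can be assembled in Python and passed to `subprocess`. This is a minimal sketch: the helper names are hypothetical, and only flags shown in the example above are used.

```python
import shlex
import subprocess
from typing import List, Optional


def build_command(input_pb: str, output_jsonl: str, genbank: str,
                  metadata: Optional[str] = None,
                  columns: Optional[List[str]] = None) -> List[str]:
    """Assemble the usher_to_taxonium argument list."""
    cmd = ["usher_to_taxonium",
           "--input", input_pb,
           "--output", output_jsonl,
           "--genbank", genbank]
    if metadata:
        cmd += ["--metadata", metadata]
    if columns:
        # The tool takes the column names as one comma-separated value.
        cmd += ["--columns", ",".join(columns)]
    return cmd


def convert(*args, **kwargs) -> None:
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    cmd = build_command("tfci.pb", "tfci-taxonium.jsonl.gz", "hu1.gb",
                        metadata="tfci.meta.tsv.gz",
                        columns=["genbank_accession", "country", "date",
                                 "pangolin_lineage"])
    print(shlex.join(cmd))  # show the command without running it
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.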

Full documentation for usher_to_taxonium#

Convert a Usher pb to Taxonium jsonl format

usage: usher_to_taxonium [-h] --input INPUT --output OUTPUT
                         [--metadata METADATA] --genbank GENBANK
                         [--chronumental]
                         [--chronumental_steps CHRONUMENTAL_STEPS]
                         [--columns COLUMNS]
                         [--chronumental_date_output CHRONUMENTAL_DATE_OUTPUT]
                         [--chronumental_tree_output CHRONUMENTAL_TREE_OUTPUT]
                         [--chronumental_reference_node CHRONUMENTAL_REFERENCE_NODE]
                         [--config_json CONFIG_JSON] [--title TITLE]
                         [--overlay_html OVERLAY_HTML] [--remove_after_pipe]

Named Arguments#

--input

Input Usher pb file

--output

Output jsonl file

--metadata

Metadata file

--genbank

Genbank file

--chronumental

If set, we will run chronumental

Default: False

--chronumental_steps

Number of steps to run chronumental for

--columns

Columns to include in the metadata

--chronumental_date_output

Output file for the chronumental date file, if any

--chronumental_tree_output

Output file for the chronumental tree file, if any

--chronumental_reference_node

Taxonium reference node

--config_json

A JSON file to use as a config file

--title

A title for the tree

--overlay_html

A file containing HTML to put in the overlay

--remove_after_pipe

If set, remove anything after a pipe (|) in each node’s name

Default: False
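To make the effect of --remove_after_pipe concrete, the sketch below shows the kind of renaming the description implies. It illustrates the documented behaviour only and is not taken from the tool's source; the sample node name is made up.

```python
def remove_after_pipe(name: str) -> str:
    """Keep only the part of a node name before the first pipe character."""
    return name.split("|", 1)[0]


if __name__ == "__main__":
    # Hypothetical node name with extra fields after the pipe.
    print(remove_after_pipe("sampleA|accession123|2021-01-01"))
```

Names that contain no pipe are returned unchanged.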

Deploying your own Taxonium backend#

Everything described above involves the full tree being processed entirely locally, in your own browser. For very large trees, this can use a lot of memory and make the initial load quite slow. To address this, you can deploy your own Taxonium backend, which runs continually on a cloud server, ready to receive requests and send back just the small part of the tree that a client needs.

This is probably best done with Docker:

or