A Virus Taxonomy Classification Framework
Case Study: Densovirinae Example
Tree visualisation features in ViCTreeView
Once the ViCTree framework is set up, the phylogeny can be updated automatically using a scheduler utility such as cron.
An example cron job that runs the ViCTree pipeline for Densovirinae can be set up by editing the user's crontab with the crontab -e command:
@monthly /home/username/victreecron.sh
where victreecron.sh is a simple script at the specified location that contains the actual ViCTree command required to run the analysis.
For example,
cd /path/to/victree/executables
bash ViCTree.sh -t 40120 -s txid40120_seeds.fa -l 100 -c 50 -p 10 -i 1.0 -n Densovirinae
Note: Please ensure that all .bashrc, .bash_profile and bash.bashrc files are sourced in the crontab, because cron's PATH environment differs from the PATH of a regular bash terminal.
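As an illustration of the note above, one possible shape for victreecron.sh is sketched below. It is a minimal sketch rather than the script shipped with ViCTree: the helper name load_shell_env is hypothetical, /etc/bash.bashrc is assumed to be the system-wide file (its location varies between distributions), and /path/to/victree/executables is the placeholder used in the example above.

```shell
#!/bin/bash
# Hypothetical victreecron.sh sketch; paths are placeholders, not real locations.

# Source each given startup file if it exists and is readable,
# so that PATH and other settings match an interactive bash session.
load_shell_env() {
  local f
  for f in "$@"; do
    if [ -r "$f" ]; then
      . "$f"
    fi
  done
}

VICTREE_DIR="/path/to/victree/executables"   # placeholder from the example above

if [ -d "$VICTREE_DIR" ]; then
  # Assumed file locations; adjust to the files that exist on your system.
  load_shell_env /etc/bash.bashrc "$HOME/.bash_profile" "$HOME/.bashrc"
  cd "$VICTREE_DIR" &&
    bash ViCTree.sh -t 40120 -s txid40120_seeds.fa -l 100 -c 50 -p 10 -i 1.0 -n Densovirinae
else
  echo "ViCTree directory not found: $VICTREE_DIR" >&2
fi
```

Sourcing the files inside the script, rather than in the crontab entry itself, keeps the crontab line short and makes the wrapper easy to test by hand.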
Different versions of a phylogeny can be visualised in ViCTreeView to study how new sequences are incorporated over time.
To achieve this, tag each version of the phylogeny with a unique identifier by changing the value of $name in the following lines of code in ViCTree.sh:
cp ${tid}/${tid}.nhx ViCTreeView/data/${name}.nhx
cp ${tid}/${tid}_label.tsv ViCTreeView/data/${name}_label.tsv
cp ${tid}/${tid}.csv ViCTreeView/data/${name}.csv
git add data/${name}.nhx data/${name}.csv data/${name}_label.tsv
git commit -m "Data files updated for $name"
These changes add a new version of the files, tagged with the unique identifier, to ViCTreeView's data folder.
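One way to obtain a unique identifier for each run is to append the run date to the group name. The sketch below is illustrative only: the function names versioned_name and publish_version and the date-based naming scheme are assumptions, not part of ViCTree.sh. The copy step mirrors the cp lines shown above.

```shell
#!/bin/bash
# Hypothetical helpers; the date-stamped naming scheme is an assumption.

# Print "<prefix>_<YYYYMMDD>" using today's date.
versioned_name() {
  local prefix="$1"
  printf '%s_%s\n' "$prefix" "$(date +%Y%m%d)"
}

# Copy the three ViCTree output files for taxon id $1 into
# data directory $3 under the version name $2.
publish_version() {
  local tid="$1" name="$2" data_dir="$3"
  cp "${tid}/${tid}.nhx" "${data_dir}/${name}.nhx" &&
    cp "${tid}/${tid}_label.tsv" "${data_dir}/${name}_label.tsv" &&
    cp "${tid}/${tid}.csv" "${data_dir}/${name}.csv"
}
```

For example, name="$(versioned_name Densovirinae)" followed by publish_version 40120 "$name" ViCTreeView/data would place one date-stamped set of files in the data folder. Two runs on the same day would produce the same name, so a finer-grained timestamp may be needed for more frequent schedules.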
A ViCTreeView instance that demonstrates visualisation of multiple phylogenies of Densovirinae is set up at http://bioinformatics.cvr.ac.uk/victree-densovirinae. Three separate versions of the Densovirinae phylogeny can be browsed using the pull-down menu labelled Examples.
It is also possible to visualise a specific version of the phylogeny in ViCTreeView. To achieve this, set the branch parameter to a specific branch in your forked repository in the following lines of the file index.html in the ViCTreeView sub-directory:
var user_id = "josephhughes"
var repo_name = "ViCTree"
var branch = "master" // change the branch name to the specific version you would like to visualise
var dir = "ViCTreeView/data"
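If the branch needs to be switched regularly, the edit can be scripted. The following is a minimal sketch under stated assumptions: the function name set_branch is hypothetical, and it assumes that index.html contains a line beginning with var branch = "..." exactly as shown above. Branch names containing | or & would need escaping before being passed in.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the value of "var branch" in a copy of index.html.
# Usage: set_branch path/to/index.html branch-name
set_branch() {
  local file="$1" branch="$2" tmp
  tmp="$(mktemp)" || return 1
  # Replace only the quoted value; anything after it on the line is kept.
  sed "s|^\([[:space:]]*var branch = \)\"[^\"]*\"|\1\"${branch}\"|" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  local status=$?
  rm -f "$tmp"
  return "$status"
}
```

Writing through a temporary file avoids sed -i, whose syntax differs between GNU and BSD sed, and copying the result back with cat keeps the permissions of the original file.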
We have implemented a Google Maps-like approach to zooming. You can zoom in and out using the mouse scroll wheel in the ViCTreeView interface, or zoom in by double-clicking the left mouse button. ViCTreeView is fully tested and works best in the Google Chrome, Safari and Firefox web browsers.
The automated phylogeny generated within ViCTree is mid-point rooted. However, ViCTreeView provides an option to re-root the tree at any node.
To re-root a tree, click on the white circle on the branch of interest, then click on Re-root to change the root of the tree.
When a specific 'Pairwise Distance' value is selected in the ViCTreeView interface, branches that fall within the specified distance are highlighted as clusters in different colours.
The ViCTree GitHub repository contains two scripts, shuffle.sh and validate.sh, that can be used to select suitable parameters for a new viral group of interest.
shuffle.sh takes three input parameters:
validate.sh uses the output files generated by shuffle.sh and a file listing the accession numbers of all currently classified sequences.
It generates a tabular output file with the following columns:
NumberOfSeeds RunNo Length Coverage SeqFound FalsePos FalseNeg
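To compare parameter combinations, the table can be scanned for the run with the fewest misclassifications. The sketch below is not part of ViCTree: the function name best_run is hypothetical, and it assumes that the output file is whitespace- or tab-separated, starts with the header row shown above, and that FalsePos and FalseNeg are the sixth and seventh columns.

```shell
#!/bin/bash
# Hypothetical helper: print the row of a validate.sh output table with the
# smallest FalsePos + FalseNeg total. Assumes a header row and the column
# order: NumberOfSeeds RunNo Length Coverage SeqFound FalsePos FalseNeg
best_run() {
  awk '
    NR == 1 { next }                      # skip the header row
    {
      err = $6 + $7                       # FalsePos + FalseNeg
      if (best == "" || err < min) {      # keep the first row with the lowest total
        min = err
        best = $0
      }
    }
    END { if (best != "") print best }
  ' "$1"
}
```

Ranking by the sum of false positives and false negatives is only one possible criterion; depending on the viral group, it may be preferable to weight the two error types differently.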
Sejal Modha, Anil Thanki, Susan F Cotmore, Andrew J Davison, Joseph Hughes; ViCTree: An automated framework for taxonomic classification from protein sequences, Bioinformatics, bty099, https://doi.org/10.1093/bioinformatics/bty099
The ViCTree framework is developed by Sejal Modha (@sejmodha), Anil Thanki (@anilthanki) and Joseph Hughes (@josephhughes).