This page is for users who want to install the development version of anvi’o, anvio-dev, on Mac OSX.
This section is not quite meant to be followed by those who would define themselves as end users in a conventional sense. But we are not the kinds of people who would dare to tell anyone what they can and cannot do. FWIW, our experience suggests that if you are doing microbiology, you will do computers no problem if you find dem computers exciting.
If you follow these steps, you will have anvi’o set up on your system in such a way that every time you initialize your anvi’o environment, you will get the very latest state of the anvi’o code. Plus, you can have both the stable and the active development versions of anvi’o on the same computer.
Nevertheless, it is important to keep in mind that there are multiple advantages and disadvantages to working with the active development branch. Advantages are obvious and include,
Full access to all new features and bug fixes in real-time, without having to wait for stable releases to be announced.
A working system to hack anvi’o and/or add new features to the code (this strategy is exactly how we develop anvi’o and use it for our science at the same time at our lab).
In contrast, disadvantages include,
Unstable intermediate states may frustrate you with bugs, and in extremely rare instances loss of data (this happened only once so far during the last five years, and required one of our users to re-generate their contigs databases).
Difficulty citing the anvi’o version in a paper, although this can easily be solved by sharing the cryptographic hash of the last commit, rather than the anvi’o version number, for reproducibility. If you ever struggle with this, please let us know and we will help you.
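The commit hash mentioned above can be read directly from a local clone. The following is a minimal sketch, assuming the code lives in ~/github/anvio (the location suggested later on this page); the function name anvio_commit_hash is hypothetical and not part of anvi’o:

```shell
# Print the full hash of the commit currently checked out in a git clone.
# `anvio_commit_hash` is a hypothetical helper name, not an anvi'o program.
anvio_commit_hash () {
    git -C "$1" rev-parse HEAD
}

# Assumed clone location; adjust if you keep the code elsewhere.
ANVIO_DIR="${HOME}/github/anvio"
if [ -d "${ANVIO_DIR}/.git" ]; then
    echo "anvi'o commit: $(anvio_commit_hash "${ANVIO_DIR}")"
fi
```

The printed hash identifies the exact state of the code you ran, so it can be reported alongside (or instead of) a version string.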
If you are still here, let’s start.
You will need to run the installation commands from a terminal. Mac OSX comes with a basic Terminal application, or you can download and use a fancier one (such as iTerm).
Some of the packages we use need compiling during the installation process, so you should also make sure that you have Xcode Command Line Tools installed and up-to-date. Here is a quick link to their installation instructions. If you have to re-install the Command Line Tools, please remember to close your terminal window and open a new one before continuing with the anvi’o installation (a big thank you to Hilary Morrison for that tip).
You also need miniconda to be installed on your system. If you don’t already have it, please follow their installation instructions.
Please note that we recently switched from Python 3.7 to Python 3.10 in our active development branch. Thus, the way we set up the conda environment for the active development branch now differs from the way we do it for the latest stable version. There may be hiccups, since these changes required many adjustments in the anvi’o code and some bugs were likely missed. If you are reading these lines, please keep us posted if you run into an issue.
Working with Apple silicon
If you are using a computer with Apple silicon (like an M1 MacBook), you will find that some conda packages (particularly bioconda packages) are not available. To avoid this issue, you can run the following command (only once) before creating the environment:
conda config --env --set subdir osx-64
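If you are not sure whether your Mac uses Apple silicon, the machine type reported by uname can tell you. This is a small sketch; the helper name suggest_conda_subdir is hypothetical, and it only prints a suggestion without changing any conda setting:

```shell
# Print "osx-64" when the given OS/machine pair looks like Apple silicon,
# and nothing otherwise. Hypothetical helper, not part of conda or anvi'o.
suggest_conda_subdir () {
    os="$1"
    machine="$2"
    if [ "$os" = "Darwin" ] && [ "$machine" = "arm64" ]; then
        echo "osx-64"
    fi
}

# `uname -s` reports the OS name and `uname -m` the hardware type.
suggestion="$(suggest_conda_subdir "$(uname -s)" "$(uname -m)")"
if [ -n "$suggestion" ]; then
    echo "Apple silicon detected; consider: conda config --env --set subdir ${suggestion}"
else
    echo "No subdir change suggested for this machine."
fi
```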
First make sure you are not in any environment by running conda deactivate. Then, make sure you don’t have an environment called anvio-dev (as in anvi’o development):
conda env remove --name anvio-dev
Create a new conda environment:
conda create -y --name anvio-dev python=3.10
And activate it:
conda activate anvio-dev
Now you are in a pristine environment, in which you will install all conda packages that anvi’o will need to work properly. This looks scary, but it will work if you just copy-paste it and press ENTER:
conda install -y -c conda-forge -c bioconda python=3.10 \
sqlite prodigal idba mcl muscle=3.8.1551 famsa hmmer diamond \
blast megahit spades bowtie2 bwa graphviz "samtools>=1.9" \
trimal iqtree trnascan-se fasttree vmatch r-base r-tidyverse \
r-optparse r-stringi r-magrittr bioconductor-qvalue meme ghostscript \
nodejs
# try this, if it doesn't install, don't worry (it is sad, but OK):
conda install -y -c bioconda fastani
If you see any error messages in the output indicating that a package failed to install, you should check the ‘Common problems’ section below or search for it in the anvi’o issues page (make sure to check the ‘Closed’ issues as well) to see if we already found a solution for the error.
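Scrolling through a long conda log is easy to get wrong, so a quick PATH check can complement it. The sketch below reports which of a few expected programs are missing; the helper name missing_tools is hypothetical, and the program list is only an assumed subset of what the packages above provide:

```shell
# Print the name of every given program that cannot be found on the PATH.
# `missing_tools` is a hypothetical helper name.
missing_tools () {
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# A subset of the programs the conda packages above are expected to provide
# (e.g. hmmscan from hmmer, blastn from blast); extend as needed.
missing="$(missing_tools prodigal hmmscan diamond blastn megahit bowtie2 bwa samtools mcl)"
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All checked programs were found."
fi
```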
If you are here, it means you have a conda environment with everything except anvi’o itself. We will make sure this environment has anvi’o by getting a copy of the anvi’o codebase from GitHub.
Here I will suggest ~/github/ as the base directory to keep the code, but you can change it to something else if you want (in which case you must remember to apply that change to all the following commands, of course). Set up the code directory:
mkdir -p ~/github && cd ~/github/
Get the anvi’o code:
If you only plan to follow the development branch, and not make changes to the codebase, you can skip this message. But if you are not an official anvi’o developer yet intend to change anvi’o and send us pull requests to reflect those changes in the official repository, you may want to clone anvi’o from your own fork rather than using the following URL. Thank you very much in advance and we are looking forward to seeing your PR!
git clone --recursive https://github.com/merenlab/anvio.git
Some packages in requirements.txt may need a more up-to-date C compiler to install on Mac OSX. Hence, we suggest that all Mac users run the following commands before starting the pip install command:
export CC=/usr/bin/clang
export CXX=/usr/bin/clang++
Finally, to install the Python dependencies of anvi’o, please run the following commands:
cd ~/github/anvio/
pip install -r requirements.txt
Now you have the latest copy of the anvi’o codebase, and all of its dependencies are in place.
Now we have the codebase and we have the conda environment, but they don’t know about each other.
Here we will set up your conda environment in such a way that every time you activate it, you will get the very latest updates from the main anvi’o repository. While you are still in the anvi’o environment, copy-paste these lines into your terminal:
cat <<EOF >${CONDA_PREFIX}/etc/conda/activate.d/anvio.sh
# creating an activation script for the conda environment for anvi'o
# development branch so (1) Python knows where to find anvi'o libraries,
# (2) the shell knows where to find anvi'o programs, and (3) every time
# the environment is activated it synchronizes with the latest code from
# active GitHub repository:
export PYTHONPATH=\$PYTHONPATH:~/github/anvio/
export PATH=\$PATH:~/github/anvio/bin:~/github/anvio/sandbox
echo -e "\033[1;34mUpdating from anvi'o GitHub \033[0;31m(press CTRL+C to cancel)\033[0m ..."
cd ~/github/anvio && git pull && cd -
EOF
If you are using zsh by default, these may not work. If you run into trouble here, or especially if you figure out a way to make it work for both zsh and bash, please let us know. To use bash to make the above command work, first run exec bash. Then re-run the command above. To go back to zsh, you can run exec zsh.
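The cat command above writes into etc/conda/activate.d inside the environment. If that directory does not exist yet, the redirection fails with a "No such file or directory" error regardless of the shell. The following sketch creates it first; the helper name ensure_activate_dir is hypothetical, and whether the directory is already there depends on which packages conda installed:

```shell
# Create the activate.d directory under a conda environment prefix if it is
# missing, and print its path. Hypothetical helper, not a conda command.
ensure_activate_dir () {
    dir="$1/etc/conda/activate.d"
    mkdir -p "$dir" && echo "$dir"
}

# CONDA_PREFIX is set by conda while an environment is active.
if [ -n "${CONDA_PREFIX:-}" ]; then
    ensure_activate_dir "${CONDA_PREFIX}"
else
    echo "No active conda environment; run 'conda activate anvio-dev' first."
fi
```

Once the directory exists, the cat command can be re-run as written.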
If everything worked, you should be able to type the following commands in a new terminal and see similar outputs:
meren ~ $ conda activate anvio-dev
Updating from anvi'o GitHub (press CTRL+C to cancel) ...
(anvio-dev) meren ~ $ which anvi-self-test
/Users/meren/github/anvio/bin/anvi-self-test
(anvio-dev) meren ~ $ anvi-self-test -v
Anvi'o .......................................: hope (v7.1-dev)
Python .......................................: 3.10.13
Profile database .............................: 38
Contigs database .............................: 21
Pan database .................................: 16
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 4
tRNA-seq database ............................: 2
(anvio-dev) meren ~ $
If that is the case, you’re all set.
Every change you make in the anvi’o codebase will immediately be reflected when you run anvi’o tools (but if you change the code and do not revert your changes, git will stop updating your branch from upstream).
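To see whether local edits might get in the way of the automatic update, you can ask git whether the working tree has uncommitted changes. This is a minimal sketch, assuming the clone lives in ~/github/anvio; the helper name has_local_changes is hypothetical:

```shell
# Succeed (exit status 0) when the git clone at $1 has uncommitted changes
# to tracked files. Hypothetical helper name.
has_local_changes () {
    [ -n "$(git -C "$1" status --porcelain --untracked-files=no)" ]
}

ANVIO_DIR="${HOME}/github/anvio"   # assumed clone location
if [ -d "${ANVIO_DIR}/.git" ]; then
    if has_local_changes "${ANVIO_DIR}"; then
        echo "Local changes found; consider 'git stash' before updating."
    else
        echo "Working tree is clean."
    fi
fi
```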
If you followed these instructions, every time you open a terminal you will have to run the following command to activate your anvi’o environment:
conda activate anvio-dev
If you are here, you can now jump to “Check your anvi’o setup” to see if things worked for you using anvi-self-test, but don’t forget to take a look at the bonus chapter below, especially if you are using bash.
If you are here, you are ready to check if everything is working on your system. This section will help you finalize your installation so you are prepared for anything.
The easiest way to check your installation is to run the anvi’o program anvi-self-test:
anvi-self-test --suite mini
If you don’t want anvi’o to show you a browser window at the end, and would rather have it quietly finish testing if everything is OK, add the --no-interactive flag to the command above. Also note that anvi-self-test is run here in --suite mini mode, which tests the absolute minimal features of your anvi’o installation. If you run it without any parameters, it will test many more things.
If everything goes smoothly, your browser should pop up and show you an anvi’o interactive interface that looks something like this once anvi-self-test is done running:
The screenshot above is from 2015 and will be vastly different from the interactive interface you should see in your browser. It is still here so we remember where we came from 😇
If you are seeing the interactive interface, it means you now have a computer that can run anvi’o! In theory you can leave this page at this moment, but there are a few more details that would be best to attend now. So please bear with this tutorial just a little longer.
Don’t forget to come say hi to us on anvi’o Discord.
This is to further prepare your anvi’o installation for things you may need later, such as databases for taxonomic annotation of your genomes or functional annotation of your genes. This is an up-to-date list of programs that you should run in your terminal to have everything ready:
anvi-self-test --suite pangenomics to see if everything is in order, especially if you plan to use anvi’o for pangenomics.
You can skip this section if you are not interested in reconstructing genomes from metagenomes using anvi’o.
Anvi’o offers a powerful interactive environment to reconstruct genomes from metagenomes where you have full control over subtle decisions. For small assemblies (i.e., where you have fewer than 25,000 contigs), you do not need additional binning software to reconstruct genomes from metagenomes. But for larger metagenomes, you have two options:
The following recipe will help you install CONCOCT on your system just so there is an automatic binning algorithm ready on your system that you can use with anvi-cluster-contigs:
# setup a place to download CONCOCT source code
mkdir -p ~/github/ && cd ~/github/
# get a clone of the CONCOCT codebase from the fork
# that is tailored for the anvi'o conda environment
git clone https://github.com/merenlab/CONCOCT.git
# build and install
cd CONCOCT
python setup.py build
python setup.py install
Please note that you may encounter an error when running CONCOCT due to a TypeError. Please see the report #2154 for more information regarding this issue. If you run into this issue, you may be able to resolve it by running the following command in your anvi’o conda environment: pip install scikit-learn==1.1.0. This solution was developed and tested, and confirmed to work at least for v8. But please let us know if this fix breaks any other part of anvi’o :)
If everything worked, when you type the following command,
anvi-cluster-contigs -h
You should see this output (where CONCOCT is found):
If you are a developer of an automatic binning algorithm and would like to see it in anvi’o, please get in touch with us. Anvi’o can pass any information about sequences (their coverages across samples, tetranucleotide frequencies, genes, functions, and whatever else you would like to have about them) to any program to run it on user data and import the results into anvi’o databases seamlessly through simple Python wrappers. Here are some examples of such wrappers for CONCOCT, for BinSanity, and for MaxBin2. If you wish to create one but are not sure how to test it, please start a GitHub issue.
If your browser didn’t show up, or testing stopped with errors, please take a look at the common problems others have reported and try these solutions. Please remember you can always come to anvi’o Discord to ask for help if things are not working for you and the answers you find here are of no use.
It is absolutely normal to see ‘warning’ messages. In general anvi’o is talkative as it would like to keep you informed. In an ideal world you should keep a careful eye on those warning messages, but in most cases they will not require action.
If anvi-self-test fails with an error message that looks something like this,
libcrypto.so.1.0.0: cannot open shared object file: no such file or directory
it is likely that the pysam module installation failed. To fix this you should revisit the installation instructions, especially the part that says “Issues related to samtools”, and then come back to testing.
If your browser does not show up, or does show up but can’t show anything due to a ‘network problem’, you may also want to visit http://localhost:8080 by manually entering this address into your browser’s address bar, which should work on your local computer. On some systems the default network interface anvi’o uses to connect to its own server causes issues. You may also find the help page for anvi-interactive useful for future reference.
If your browser does not show up while you are connected to a remote computer, it is quite normal. In some cases a text-based browser may show up instead of your graphical browser, too. This is because you are running anvi’o on another computer, and it tries to open a browser there. You can set things up for anvi’o to use your local browser to access an anvi’o interactive interface running remotely. For that, you can read this article (or ask your systems administrator to read it) to learn how you can forward displays from servers to your personal computer.
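One common way to reach a remote interface from a local browser is SSH local port forwarding. The sketch below only assembles and prints such a command; the login and server name are hypothetical placeholders, port 8080 is taken from the default address mentioned above, and this reflects a typical setup rather than necessarily the method described in the linked article:

```shell
# Hypothetical values; replace with your own login and server name.
REMOTE_LOGIN="user@server.example.org"
PORT=8080

# Build the tunnel command: forward local PORT to the same port on the server.
TUNNEL_CMD="ssh -L ${PORT}:localhost:${PORT} ${REMOTE_LOGIN}"

# Print it for review; run it yourself once the values are right.
echo "${TUNNEL_CMD}"
```

While such a tunnel is open, visiting http://localhost:8080 in your local browser reaches port 8080 on the remote machine.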
If you are not using Chrome as your default browser, anvi’o will complain about it :/ We hate the idea of asking you to change your browser preferences for anvi’o :( But currently, Chrome maintains the most efficient SVG engine among all browsers we tested as of 2021. For instance, Safari can run the anvi’o interactive interface, however it takes orders of magnitude more time and memory compared to Chrome. Firefox, on the other hand, doesn’t even bother drawing anything at all. Long story short, the anvi’o interactive interface will not perform optimally with anything but Chrome. So you need Chrome. Moreover, if Chrome is not your default browser, every time the interactive interface pops up, you will need to copy-paste the address from the address bar into a Chrome window.
You can find out what your default browser is by running this command in your terminal:
python -c 'import webbrowser as w; w.open_new("http://")'
If you open a new terminal and get a command not found error when you run anvi’o commands, it means you need to activate your anvi’o conda environment by running the following command (assuming that you named your conda environment for anvi’o anvio-8, but you can always list your conda environments by running conda env list):
conda activate anvio-8
If you are getting an error that goes like,
Config Error: Something went wrong during the functional enrichment analysis :( We don't know
what happened, but this log file could contain some clues: (...)
it often means that the R libraries that are needed to run functional enrichment analyses are not installed properly through conda :/ Luckily, you can try to install them using the R terminal as Marco Gabrielli shared on anvi’o Discord. For this, try running this command in your terminal:
Rscript -e 'install.packages(c("stringi", "tidyverse", "magrittr", "optparse"), repos="https://cloud.r-project.org")'
If everything goes alright, you can quit the R terminal by pressing CTRL+D twice. Once you are out, you can run this command to see if everything runs smoothly:
Rscript -e "library('tidyverse')"
In some cases the problem is the qvalue package, which can be a pain to install. If you are having a hard time with that one, you can try this and see if that solves it:
Rscript -e 'install.packages("BiocManager", repos="https://cran.rstudio.com"); BiocManager::install("qvalue")'
Now you can take a look at some anvi’o resources here, or join anvi’o Discord to be a part of our growing community.
This section is written by Meren and reflects his setup on a Mac system that runs miniconda where bash is set up as the default shell. If you are using another shell and would like to share your solution, please send a PR!
This is all personal taste and these lines may need to change from computer to computer, but I added the following at the end of my ~/.bash_profile to easily switch between different versions of anvi’o on my Mac system:
# This is where my miniconda base is, you can find out
# where is yours by running this in your terminal:
#
# conda env list | grep base
#
export MY_MINICONDA_BASE="/Users/$USER/miniconda3"
init_anvio_7 () {
    deactivate &> /dev/null
    conda deactivate &> /dev/null
    export PATH="$MY_MINICONDA_BASE/bin:$PATH"
    . $MY_MINICONDA_BASE/etc/profile.d/conda.sh
    conda activate anvio-7.1
    export PS1="\[\e[0m\e[47m\e[1;30m\] :: anvi'o v7.1 :: \[\e[0m\e[0m \[\e[1;32m\]\]\w\[\e[m\] \[\e[1;31m\]>>>\[\e[m\] \[\e[0m\]"
}
init_anvio_dev () {
    deactivate &> /dev/null
    conda deactivate &> /dev/null
    export PATH="$MY_MINICONDA_BASE/bin:$PATH"
    . $MY_MINICONDA_BASE/etc/profile.d/conda.sh
    conda activate anvio-dev
    export PS1="\[\e[0m\e[40m\e[1;30m\] :: anvi'o v7.1 dev :: \[\e[0m\e[0m \[\e[1;34m\]\]\w\[\e[m\] \[\e[1;31m\]>>>\[\e[m\] \[\e[0m\]"
}
alias anvio-7.1=init_anvio_7
alias anvio-dev=init_anvio_dev
You can either open a new terminal window or run source ~/.bash_profile to make sure these changes take effect. Now you should be able to type anvio-7.1 to initialize the stable anvi’o, and anvio-dev to initialize the development branch of the codebase.
Here is what I see in my terminal for anvio-7.1:
meren ~ $ anvi-self-test -v
-bash: anvi-self-test: command not found
meren ~ $ anvio-7.1
:: anvi'o v7.1 :: ~ >>>
:: anvi'o v7.1 :: ~ >>> anvi-self-test -v
Anvi'o .......................................: hope (v7.1)
Profile database .............................: 38
Contigs database .............................: 20
Pan database .................................: 15
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 2
tRNA-seq database ............................: 2
Or for anvio-dev:
meren ~ $ anvi-self-test -v
-bash: anvi-self-test: command not found
:: anvi'o v7.1 :: ~ >>> anvio-dev
:: anvi'o v7.1 dev :: ~ >>>
:: anvi'o v7.1 dev :: ~ >>> anvi-self-test -v
Anvi'o .......................................: hope (v7.1-dev)
Python .......................................: 3.10.12
Profile database .............................: 38
Contigs database .............................: 21
Pan database .................................: 16
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 4
tRNA-seq database ............................: 2
But please note that both aliases run deactivate and conda deactivate first, and they may not work for you, especially if you have a fancy setup.