anvi-setup-globdb-functions

Download and set up the GlobDB gene family database for functional annotation.


Authors

Can consume

This program seems to know what it's doing. It needs no input material from its user. Good program.

Can provide

globdb-data

Usage

This program downloads and sets up a local copy of the GlobDB gene family database for use in functional annotation with anvi-run-globdb-functions. It produces a globdb-data artifact.

Basic usage

anvi-setup-globdb-functions

We recommend using --num-threads to speed up the DIAMOND database build step.
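
As a rough illustration, assuming --num-threads takes an integer thread count, one way to choose a value is to ask the system how many CPUs are online. The variable name `threads` is only illustrative, and the command is printed rather than executed so it can be reviewed first:

```shell
# Pick a thread count from the number of online CPUs (falls back to 1
# if getconf is unavailable on this system).
threads="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Print the command for review; remove the leading `echo` to run it.
echo anvi-setup-globdb-functions --num-threads "$threads"
```

On a shared machine you may prefer a smaller, fixed number than the full CPU count.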

If you already have a globdb-data artifact and want to re-download and rebuild everything from scratch:

anvi-setup-globdb-functions --reset

Custom data directory

By default, anvi’o stores the GlobDB data in a location inside the anvi’o package directory. If you do not have write access to that location, or if you want to keep the data elsewhere, use:

anvi-setup-globdb-functions --globdb-data-dir /path/to/your/directory

You can also set the environment variable ANVIO_GLOBDB_DATA_DIR to your preferred path so anvi’o will use it automatically without requiring the --globdb-data-dir flag each time:

export ANVIO_GLOBDB_DATA_DIR=/path/to/your/directory
anvi-setup-globdb-functions
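
Since the custom directory is mainly useful when the default location is not writable, it can help to confirm write access before exporting the variable. The helper below is a minimal sketch; `use_globdb_dir` is a hypothetical name and not part of anvi'o:

```shell
# Hypothetical helper: create the directory if needed, confirm it is
# writable, and only then export ANVIO_GLOBDB_DATA_DIR.
use_globdb_dir() {
    globdb_dir="$1"
    mkdir -p "$globdb_dir" 2>/dev/null || true
    if [ -d "$globdb_dir" ] && [ -w "$globdb_dir" ]; then
        export ANVIO_GLOBDB_DATA_DIR="$globdb_dir"
    else
        echo "Cannot write to $globdb_dir" >&2
        return 1
    fi
}

# Example (not run here):
#   use_globdb_dir /path/to/your/directory && anvi-setup-globdb-functions
```

An exported variable lasts only for the current shell session. To keep it across sessions, add the export line to your shell startup file (for example ~/.bashrc).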

What happens during setup

  1. The GlobDB data package (maintained by the GlobDB team, including Daan Speth et al.) is downloaded and extracted.
  2. Every gene family info.yaml file is validated for required fields (gene_family, description, version, and cutoffs including lasr, selfmax, selfmin, and matrix). Where present, synteny.yaml files are also validated.
  3. All per-family FASTA files are concatenated into a single GlobDB.faa (with GAA identifiers prepended to sequence headers).
  4. All per-family info.yaml files are merged into a single GlobDB-gene-family-data.yaml. All per-family synteny.yaml files (where present) are merged into a single GlobDB-synteny-data.yaml.
  5. A DIAMOND search database is built from GlobDB.faa.
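
Step 3 can be pictured with a small shell sketch. This is not the code anvi'o runs. The directory layout (one folder per gene family, named by its identifier and holding a `sequences.faa` file), the `|` separator, and the function name `concat_families` are all assumptions made for illustration:

```shell
# Sketch of step 3 under an assumed layout:
#   <src_dir>/<FAMILY_ID>/sequences.faa
# Each header is rewritten as ">FAMILY_ID|original header".
concat_families() {
    src_dir="$1"
    out_faa="$2"
    : > "$out_faa"    # start from an empty output file
    for fasta in "$src_dir"/*/sequences.faa; do
        [ -e "$fasta" ] || continue    # skip if the glob matched nothing
        family_id="$(basename "$(dirname "$fasta")")"
        # Prefix header lines with the family identifier; copy sequence
        # lines unchanged.
        awk -v id="$family_id" \
            '/^>/ { print ">" id "|" substr($0, 2); next } { print }' \
            "$fasta" >> "$out_faa"
    done
}
```

With a combined file in hand, step 5 corresponds to what one would do manually with `diamond makedb --in GlobDB.faa -d GlobDB --threads N`. The exact invocation anvi'o uses internally may differ.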
