anvi-gen-genomes-storage

Create a genome storage from internal and/or external genomes for a pangenome analysis.


Can consume

external-genomes internal-genomes

Can provide

genomes-storage-db

Usage

This program generates a genomes-storage-db, which stores information about your genomes, primarily for use in pangenomic analysis.

Genomes storage databases are to Anvi’o’s pangenomic workflow what a contigs-db is to a metagenomic workflow: they store vital information and are passed to most programs you’ll want to run.

Once you’ve generated a genomes-storage-db, you can run anvi-pan-genome, which creates a pan-db and runs various pangenomic analyses (including calculating the similarities between your sequences, identifying gene clusters, and organizing your gene clusters and genomes). After that, you can display your pangenome with anvi-display-pan. For more information, check out the pangenomic workflow.
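The sequence of programs described above can be sketched as follows. The file name MY-GENOMES.db, the project name MY_PAN, and the external-genomes.txt input are hypothetical, and the location of the resulting pan-db assumes that anvi-pan-genome writes into a directory named after the project. The small run helper is not part of Anvi’o; it only prints each command and executes it if the program is installed.

```shell
# Hypothetical file and project names
STORAGE="MY-GENOMES.db"
PROJECT="MY_PAN"

# Hypothetical helper: print a command, and run it only if the program exists
run() {
    echo "+ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# 1. Build the genomes storage from external genomes
run anvi-gen-genomes-storage -e external-genomes.txt -o "$STORAGE"

# 2. Compute the pangenome (creates the pan-db)
run anvi-pan-genome -g "$STORAGE" -n "$PROJECT"

# 3. Open the interactive display (assumed pan-db location)
run anvi-display-pan -g "$STORAGE" -p "${PROJECT}/${PROJECT}-PAN.db"
```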

Inputs: internal and external genomes

You can initialize your genomes storage database with internal-genomes, external-genomes, or both.

internal-genomes describe genomes that each correspond to a bin within a collection stored in an Anvi’o profile-db. For example, you would use these if you had gone through the metagenomic workflow and had several MAGs that you wanted to include in a pangenomic analysis.

anvi-gen-genomes-storage -i internal-genomes \
                         -o genomes-storage-db

The name of your genomes storage database (which follows the -o flag) must end with -GENOMES.db. This helps differentiate it from other types of Anvi’o databases, such as the contigs-db and profile-db.
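If you script this step, you may want to check the output name before launching the program. Below is a minimal sketch of such a check; the function is_valid_storage_name and the file name PROCHLORO-GENOMES.db are hypothetical and not part of Anvi’o.

```shell
# Hypothetical helper: succeed only if the name ends with -GENOMES.db
is_valid_storage_name() {
    case "$1" in
        *-GENOMES.db) return 0 ;;
        *) return 1 ;;
    esac
}

out="PROCHLORO-GENOMES.db"   # hypothetical output name

if is_valid_storage_name "$out"; then
    echo "OK: $out"
else
    echo "Output name must end with -GENOMES.db: $out" >&2
fi
```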

In contrast, external-genomes describe genomes that are contained in a FASTA file that you’ve turned into a contigs-db (using anvi-gen-contigs-database). This is the case, for example, for genomes that you downloaded from NCBI.

anvi-gen-genomes-storage -e external-genomes \
                         -o genomes-storage-db
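For reference, here is a minimal sketch of how an external-genomes file could be written, assuming the usual tab-delimited layout with a name column and a contigs_db_path column. The genome names and paths are hypothetical placeholders.

```shell
# Write a two-genome external-genomes file (tab-delimited, with a header row)
printf 'name\tcontigs_db_path\n'          >  external-genomes.txt
printf 'Genome_A\t/path/to/Genome_A.db\n' >> external-genomes.txt
printf 'Genome_B\t/path/to/Genome_B.db\n' >> external-genomes.txt

# Inspect the result
cat external-genomes.txt
```

Each row after the header points to one contigs-db, so the file grows by one line per genome.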

You can also create a genomes storage database from both types of genomes at the same time, for example to compare MAGs from a metagenomic analysis of an environmental sample with reference genomes from NCBI. To do this, provide both types of genomes as parameters, like so:

anvi-gen-genomes-storage -i internal-genomes \
                         -e external-genomes \
                         -o genomes-storage-db

Changing the gene caller

By default, Anvi’o will use gene calls from Prodigal and will let you know if your genomes contain gene calls identified by other gene callers. To use the gene calls from a specific gene caller instead, pass its name with the flag --gene-caller.

If you’re wondering what gene callers are available in your contigs-db, you can check by running the program anvi-export-gene-calls on a specific contigs-db with the flag --list-gene-callers.
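The two steps above could be combined roughly as follows. The contigs-db name Genome_A.db, the gene caller name NCBI_PGAP, and the input and output file names are hypothetical; substitute whatever --list-gene-callers reports for your own data. The guard only keeps the sketch from failing on a machine without Anvi’o.

```shell
CONTIGS_DB="Genome_A.db"    # hypothetical contigs-db
GENE_CALLER="NCBI_PGAP"     # hypothetical gene caller name

if command -v anvi-export-gene-calls >/dev/null 2>&1; then
    # List the gene callers found in this contigs-db
    anvi-export-gene-calls -c "$CONTIGS_DB" --list-gene-callers

    # Build the genomes storage from the chosen caller's gene calls
    anvi-gen-genomes-storage -e external-genomes.txt \
                             --gene-caller "$GENE_CALLER" \
                             -o MY-GENOMES.db
else
    echo "Anvi'o programs not found on PATH; nothing was run."
fi
```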

