Help pages for anvi'o programs and artifacts

Here you will find a list of all anvi’o programs and artifacts that enable constructing workflows for integrated multi ‘omics investigations.

If you need an introduction to the terminology used in ‘omics research or in anvi’o, please take a look at our vocabulary page. The anvi’o community is with you! If you have practical, technical, or science questions, see this page to learn about resources available to you. If you are feeling overwhelmed, you can always scream towards the anvi’o Discord channel.

The help contents were last updated on 14 Jan 22 11:31:09 for anvi’o version 7 (hope).

The latest version of anvi’o is v8. See the release notes.

Anvi’o artifacts

Anvi’o artifacts represent concepts, file types, or data types anvi’o programs can work with. A given anvi’o artifact can be provided by the user (such as a FASTA file), produced by anvi’o (such as a profile database), or both (such as phylogenomic trees). Anvi’o artifacts link anvi’o programs to each other to build novel workflows.
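
To make the idea of artifacts linking programs more concrete, here is a minimal Python sketch that treats a handful of entries from the program list on this page as a small dependency graph. It assumes that the first artifact line under each program lists what the program can use and the second what it can provide. The `PROGRAMS` table and the `plan` function are hypothetical names made up for this illustration and are not part of anvi'o. Inputs are simplified: the optional external-gene-calls input of anvi-gen-contigs-database is left out, and every remaining input is treated as required.

```python
# Hypothetical table built from four entries of the program list on this page:
# program name -> (artifacts it uses, artifacts it provides).
PROGRAMS = {
    "anvi-gen-contigs-database": ({"contigs-fasta"}, {"contigs-db"}),
    "anvi-init-bam": ({"raw-bam-file"}, {"bam-file"}),
    "anvi-profile": (
        {"bam-file", "contigs-db"},
        {"single-profile-db", "misc-data-items-order", "variability-profile"},
    ),
    "anvi-merge": (
        {"single-profile-db", "contigs-db"},
        {"profile-db", "misc-data-items-order"},
    ),
}


def plan(available, target, programs=PROGRAMS):
    """Greedy forward chaining over the table above.

    Repeatedly picks any program whose inputs are all available and that
    adds at least one new artifact, until the target artifact exists.
    Returns the list of program names in run order, or None if the target
    cannot be reached. The result is not guaranteed to be minimal.
    """
    have = set(available)
    steps = []
    progress = True
    while target not in have and progress:
        progress = False
        for name, (needs, gives) in programs.items():
            if name not in steps and needs <= have and not gives <= have:
                steps.append(name)
                have |= gives
                progress = True
    return steps if target in have else None


if __name__ == "__main__":
    print(plan({"contigs-fasta", "raw-bam-file"}, "profile-db"))
    # ['anvi-gen-contigs-database', 'anvi-init-bam', 'anvi-profile', 'anvi-merge']
```

Starting from only a FASTA file and a raw BAM file, the sketch recovers the order in which these four programs would have to run to reach a merged profile database. Real workflows involve more detail than this table captures (for example, several samples and their single profiles), so treat it as a reading aid for the list rather than a workflow engine.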

Listed below are a total of 105 artifacts.

pan-db contigs-db trnaseq-db modules-db structure-db pdb-db kegg-data single-profile-db profile-db genes-db genomes-storage-db
fasta contigs-fasta trnaseq-fasta concatenated-gene-alignment-fasta short-reads-fasta genes-fasta locus-fasta
configuration-ini external-gene-calls protein-structure-txt samples-txt fasta-txt collection-txt misc-data-items-txt misc-data-layers-txt misc-data-nucleotides-txt misc-data-amino-acids-txt misc-data-layer-orders-txt misc-data-items-order-txt linkmers-txt gene-calls-txt binding-frequencies-txt functions-txt functional-enrichment-txt view-data layer-taxonomy-txt gene-taxonomy-txt genome-taxonomy-txt external-genomes internal-genomes metagenomes coverages-txt detection-txt variability-profile-txt codon-frequencies-txt aa-frequencies-txt fixation-index-matrix kegg-metabolism augustus-gene-calls vcf blast-table splits-txt genbank-file groups-txt splits-taxonomy-txt hmm-hits-matrix-txt clustering-configuration
bam-file raw-bam-file
contigs-stats genes-stats
hmm-hits completion misc-data-items misc-data-layers misc-data-nucleotides misc-data-amino-acids genome-similarity misc-data-layer-orders misc-data-items-order metapangenome oligotypes functions kegg-functions layer-taxonomy gene-taxonomy genome-taxonomy scgs-taxonomy-db scgs-taxonomy trna-taxonomy-db trna-taxonomy variability-profile split-bins state ngrams pn-ps-data
cogs-data pfams-data interacdome-data
dendrogram phylogeny
state-json workflow-config
contigs-workflow metagenomics-workflow pangenomics-workflow phylogenomics-workflow trnaseq-workflow

Anvi’o programs

Anvi’o programs perform atomic tasks that can be woven together to implement complete ‘omics workflows. Please note that there may be programs that are not listed on this page. You can type ‘anvi-’ in your terminal and press the TAB key twice to see the full list of programs available to you on your system, and type anvi-program-name --help to read the full list of command line options.
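
If TAB completion is not available (for instance inside a script), a rough equivalent is to scan the directories on your PATH for executables whose names start with 'anvi-'. The sketch below does that with the Python standard library only. The function name `find_programs` and its parameters are made up for this example, and it assumes a Unix-like system where executability can be checked with `os.access`.

```python
import os
from pathlib import Path


def find_programs(prefix="anvi-", path_dirs=None):
    """Return the sorted names of executable files starting with `prefix`.

    `path_dirs` defaults to the directories listed in the PATH environment
    variable. Empty entries and unreadable directories are skipped.
    """
    if path_dirs is None:
        path_dirs = os.environ.get("PATH", "").split(os.pathsep)

    names = set()
    for directory in path_dirs:
        if not directory:
            continue
        folder = Path(directory)
        if not folder.is_dir():
            continue
        try:
            entries = list(folder.iterdir())
        except OSError:
            # No permission to list this directory; move on.
            continue
        for entry in entries:
            if (
                entry.name.startswith(prefix)
                and entry.is_file()
                and os.access(entry, os.X_OK)
            ):
                names.add(entry.name)
    return sorted(names)


if __name__ == "__main__":
    for name in find_programs():
        print(name)
```

Each name printed this way can then be run with --help, as described above, to read its command line options.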

Listed below are a total of 118 programs.

🔥 anvi-analyze-synteny. Extract ngrams, as in 'co-occurring genes in synteny', from genomes.
🧀 genomes-storage-db functions pan-db
🍕 ngrams
🔥 anvi-cluster-contigs. A program to cluster items in a merged anvi'o profile using automatic binning algorithms.
🧀 profile-db contigs-db collection
🍕 collection bin
🔥 anvi-compute-completeness. A script to generate completeness info for a given list of splits.
🧀 contigs-db splits-txt hmm-source
🔥 anvi-compute-functional-enrichment. This is a driver program for anvi-script-enrichment-stats, a script that computes enrichment scores and group associations for annotated entities (i.e., functions, KEGG Modules) across groups of genomes or samples.
🧀 kegg-metabolism groups-txt misc-data-layers pan-db genomes-storage-db external-genomes internal-genomes
🍕 functional-enrichment-txt
🔥 anvi-compute-gene-cluster-homogeneity. Compute homogeneity for gene clusters.
🧀 pan-db genomes-storage-db
🔥 anvi-compute-genome-similarity. Export sequences from sequence sources and compute a similarity metric (e.g. ANI). If a pan database is given, anvi'o will write the computed output to the misc data tables of the pan database.
🧀 external-genomes internal-genomes pan-db
🍕 genome-similarity
🔥 anvi-convert-trnaseq-database. A program that processes one or more anvi'o tRNA-seq databases generated by anvi-trnaseq to generate anvi'o contigs and merged profile databases that are accessible to the rest of the tools in the anvi'o software ecosystem. Briefly, this program will determine final seed sequences from input tRNA-seq databases, determine their coverages across samples, identify tRNA modification sites and INDELs associated with transcripts in each sample against the seed sequences, and store all these data in the resulting databases for interactive visualization or in-depth analysis using other anvi'o frameworks.
🧀 trnaseq-db
🍕 contigs-db profile-db
🔥 anvi-db-info. Access self tables, display values, or set new ones totally at your own risk.
🧀 pan-db profile-db contigs-db genomes-storage-db structure-db genes-db
🔥 anvi-delete-collection. Remove a collection from a given profile database.
🧀 profile-db collection
🔥 anvi-delete-hmms. Remove HMM hits from an anvi'o contigs database.
🧀 contigs-db hmm-source hmm-hits
🔥 anvi-delete-misc-data. Remove stuff from 'additional data' or 'order' tables for either items or layers in either pan or profile databases. OR, remove stuff from the 'additional data' tables for nucleotides or amino acids in contigs databases.
🧀 pan-db profile-db misc-data-items misc-data-layers misc-data-layer-orders misc-data-nucleotides misc-data-amino-acids
🔥 anvi-delete-state. Delete an anvi'o state from a pan or profile database.
🧀 pan-db profile-db state
🔥 anvi-dereplicate-genomes. Identify redundant (highly similar) genomes.
🧀 external-genomes internal-genomes fasta genome-similarity
🍕 fasta
🔥 anvi-display-contigs-stats. Start the anvi'o interactive interface for viewing or comparing contigs statistics.
🧀 contigs-db
🍕 contigs-stats interactive svg
🔥 anvi-display-metabolism. Start the anvi'o interactive interface for viewing KEGG metabolism data.
🧀 contigs-db kegg-data kegg-functions profile-db collection bin
🍕 interactive
🔥 anvi-display-pan. Start an anvi'o server to display a pan-genome.
🧀 pan-db genomes-storage-db
🍕 collection bin interactive svg
🔥 anvi-display-structure. Interactively visualize sequence variants on protein structures.
🧀 structure-db variability-profile-txt contigs-db profile-db splits-txt
🍕 interactive
🔥 anvi-estimate-genome-completeness. Estimate completion and redundancy using domain-specific single-copy core genes.
🧀 contigs-db profile-db external-genomes collection
🍕 completion
🔥 anvi-estimate-metabolism. Reconstructs metabolic pathways and estimates pathway completeness for a given set of contigs.
🧀 contigs-db kegg-data kegg-functions profile-db collection bin external-genomes internal-genomes metagenomes
🍕 kegg-metabolism
🔥 anvi-estimate-scg-taxonomy. Estimates taxonomy at genome and metagenome level. This program is the entry point to estimate taxonomy for a given set of contigs (i.e., all contigs in a contigs database, or contigs described in collections as bins). For this, it uses single-copy core gene sequences and the GTDB database.
🧀 profile-db contigs-db scgs-taxonomy collection bin metagenomes
🍕 genome-taxonomy genome-taxonomy-txt
🔥 anvi-estimate-trna-taxonomy. Estimates taxonomy at genome and metagenome level using tRNA sequences.
🧀 profile-db contigs-db trna-taxonomy collection bin metagenomes
🍕 genome-taxonomy genome-taxonomy-txt
🔥 anvi-experimental-organization. Create an experimental clustering dendrogram.
🧀 clustering-configuration
🍕 dendrogram
🔥 anvi-export-collection. Export a collection from an anvi'o database.
🧀 profile-db collection
🍕 collection-txt
🔥 anvi-export-contigs. Export contigs (or splits) from an anvi'o contigs database.
🧀 contigs-db
🍕 contigs-fasta
🔥 anvi-export-functions. Export functions of genes from an anvi'o contigs database for a given annotation source.
🧀 contigs-db functions
🍕 functions-txt
🔥 anvi-export-gene-calls. Export gene calls from an anvi'o contigs database.
🧀 contigs-db
🍕 gene-calls-txt
🔥 anvi-export-gene-coverage-and-detection. Export gene coverage and detection data for all genes associated with contigs described in a profile database.
🧀 profile-db contigs-db
🍕 coverages-txt detection-txt
🔥 anvi-export-items-order. Export an item order from an anvi'o database.
🧀 pan-db profile-db
🍕 misc-data-items-order-txt dendrogram phylogeny
🔥 anvi-export-locus. This program helps you cut a 'locus' from a larger genetic context (e.g., contigs, genomes). By default, anvi'o will locate a user-defined anchor gene, extend its selection upstream and downstream based on the --num-genes argument, then extract the locus to create a new contigs database. The anchor gene must be provided as --search-term, --gene-caller-ids, or --hmm-sources. If --flank-mode is designated, you MUST provide TWO flanking genes that define the locus region (please see the --flank-mode help for more information). If everything goes as planned, anvi'o will give you individual locus contigs databases for every matching anchor gene found in the original contigs database provided. Enjoy your mini contigs databases!
🧀 contigs-db
🍕 locus-fasta
🔥 anvi-export-misc-data. Export additional data or order tables in pan or profile databases for items or layers.
🧀 pan-db profile-db contigs-db misc-data-items misc-data-layers misc-data-layer-orders misc-data-nucleotides misc-data-amino-acids
🍕 misc-data-items-txt misc-data-layers-txt misc-data-layer-orders-txt misc-data-nucleotides-txt misc-data-amino-acids-txt
🔥 anvi-export-splits-and-coverages. Export split or contig sequences and coverages across samples stored in an anvi'o profile database. This program is especially useful if you would like to 'bin' your splits or contigs outside of anvi'o and import the binning results into anvi'o using the anvi-import-collection program.
🧀 profile-db contigs-db
🍕 contigs-fasta coverages-txt
🔥 anvi-export-splits-taxonomy. Export taxonomy for splits found in an anvi'o contigs database.
🧀 contigs-db
🍕 splits-taxonomy-txt
🔥 anvi-export-state. Export an anvi'o state from a pan or profile database.
🧀 pan-db profile-db state
🍕 state-json
🔥 anvi-export-structures. Export .pdb structure files from a structure database.
🧀 structure-db
🍕 protein-structure-txt
🔥 anvi-gen-contigs-database. Generate a new anvi'o contigs database.
🧀 contigs-fasta external-gene-calls
🍕 contigs-db
🔥 anvi-gen-fixation-index-matrix. Generate a pairwise matrix of fixation indices between samples.
🧀 contigs-db profile-db structure-db bin variability-profile-txt splits-txt
🍕 fixation-index-matrix
🔥 anvi-gen-gene-consensus-sequences. Collapse variability for a set of genes across samples.
🧀 profile-db contigs-db
🍕 genes-fasta
🔥 anvi-gen-gene-level-stats-databases. A program to compute genes databases for a given set of bins stored in an anvi'o collection. Genes databases store gene-level coverage and detection statistics, and they are usually computed and generated automatically when they are required (such as running anvi-interactive with the --gene-mode flag). This program allows you to pre-compute them if you don't want them to be done all at once.
🧀 profile-db contigs-db collection bin
🍕 genes-db
🔥 anvi-gen-genomes-storage. Create a genome storage from internal and/or external genomes for a pangenome analysis.
🧀 external-genomes internal-genomes
🍕 genomes-storage-db
🔥 anvi-gen-phylogenomic-tree. Generate a phylogenomic tree from an alignment file.
🧀 concatenated-gene-alignment-fasta
🍕 phylogeny
🔥 anvi-gen-structure-database. Identifies genes in your contigs database that encode proteins that are homologous to proteins with solved structures. If sufficiently similar homologs are identified, they are used as structural templates to predict the 3D structure of proteins in your contigs database.
🧀 contigs-db pdb-db
🍕 structure-db
🔥 anvi-gen-variability-network. A program to generate a network description from an anvi'o variability profile (potentially outdated program).
🧀 variability-profile
🔥 anvi-gen-variability-profile. Generate a table that comprehensively summarizes the variability of nucleotide, codon, or amino acid positions. We call these single nucleotide variants (SNVs), single codon variants (SCVs), and single amino acid variants (SAAVs), respectively.
🧀 contigs-db profile-db structure-db bin variability-profile splits-txt
🍕 variability-profile-txt
🔥 anvi-get-aa-counts. Fetches the number of times each amino acid occurs from a contigs database in a given bin, set of contigs, or set of genes.
🧀 splits-txt contigs-db profile-db collection
🍕 aa-frequencies-txt
🔥 anvi-get-codon-frequencies. Get amino acid or codon frequencies of genes in a contigs database.
🧀 contigs-db
🍕 codon-frequencies-txt aa-frequencies-txt
🔥 anvi-get-sequences-for-gene-calls. A script to get back sequences for gene calls.
🧀 contigs-db genomes-storage-db
🍕 genes-fasta external-gene-calls
🔥 anvi-get-sequences-for-gene-clusters. Do cool stuff with gene clusters in anvi'o pan genomes.
🧀 pan-db genomes-storage-db
🍕 genes-fasta concatenated-gene-alignment-fasta misc-data-items
🔥 anvi-get-sequences-for-hmm-hits. Get sequences for HMM hits from many inputs.
🧀 contigs-db profile-db external-genomes internal-genomes hmm-source hmm-hits
🍕 genes-fasta concatenated-gene-alignment-fasta
🔥 anvi-get-short-reads-from-bam. Get short reads back from a BAM file with options for compression, splitting of forward and reverse reads, etc.
🧀 profile-db contigs-db bin bam-file
🍕 short-reads-fasta
🔥 anvi-get-short-reads-mapping-to-a-gene. Recover short reads from BAM files that were mapped to genes you are interested in. It is possible to work with a single gene call, or a bunch of them. Similarly, you can get short reads from a single BAM file, or from many of them.
🧀 contigs-db bam-file
🍕 short-reads-fasta
🔥 anvi-get-split-coverages. Export splits and the coverage table from database.
🧀 profile-db contigs-db collection bin
🍕 coverages-txt
🔥 anvi-import-collection. Import an external binning result into anvi'o.
🧀 contigs-db profile-db pan-db collection-txt
🍕 collection
🔥 anvi-import-functions. Parse and store functional annotation of genes.
🧀 contigs-db functions-txt
🍕 functions
🔥 anvi-import-items-order. Import a new items order into an anvi'o database.
🧀 pan-db profile-db misc-data-items-order-txt dendrogram phylogeny
🍕 misc-data-items-order
🔥 anvi-import-misc-data. Populate additional data or order tables in pan or profile databases for items and layers, OR additional data in contigs databases for nucleotides and amino acids (the Swiss army knife-level serious stuff).
🧀 pan-db profile-db contigs-db misc-data-items-txt dendrogram phylogeny misc-data-layers-txt misc-data-layer-orders-txt misc-data-nucleotides-txt misc-data-amino-acids-txt
🍕 misc-data-items misc-data-layers misc-data-layer-orders misc-data-nucleotides misc-data-amino-acids
🔥 anvi-import-state. Import an anvi'o state into a profile database.
🧀 pan-db profile-db state-json
🍕 state
🔥 anvi-import-taxonomy-for-genes. Import gene-level taxonomy into an anvi'o contigs database.
🧀 contigs-db gene-taxonomy-txt
🍕 gene-taxonomy
🔥 anvi-import-taxonomy-for-layers. Import layers-level taxonomy into an anvi'o additional layer data table in an anvi'o single-profile database.
🧀 single-profile-db layer-taxonomy-txt
🍕 layer-taxonomy
🔥 anvi-init-bam. Sort/Index BAM files.
🧀 raw-bam-file
🍕 bam-file
🔥 anvi-inspect. Start an anvi'o inspect interactive interface.
🧀 profile-db contigs-db bin
🍕 interactive
🔥 anvi-interactive. Start an anvi'o server for the interactive interface.
🧀 profile-db single-profile-db contigs-db genes-db bin view-data dendrogram phylogeny
🍕 collection bin interactive svg
🔥 anvi-matrix-to-newick. Takes a distance matrix, returns a newick tree.
🧀 view-data
🍕 dendrogram
🔥 anvi-merge. Merge multiple anvi'o profiles.
🧀 single-profile-db contigs-db
🍕 profile-db misc-data-items-order
🔥 anvi-merge-bins. Merge a given set of bins in an anvi'o collection.
🧀 pan-db profile-db collection bin
🔥 anvi-meta-pan-genome. Convert a pangenome into a metapangenome.
🧀 internal-genomes pan-db genomes-storage-db
🍕 metapangenome
🔥 anvi-migrate. Migrate an anvi'o database or config file to a newer version.
🧀 contigs-db profile-db pan-db genes-db genomes-storage-db structure-db modules-db workflow-config
🔥 anvi-oligotype-linkmers. Takes an anvi'o linkmers report, generates an oligotyping output.
🧀 linkmers-txt
🍕 oligotypes
🔥 anvi-pan-genome. An anvi'o program to compute a pangenome from an anvi'o genome storage.
🧀 genomes-storage-db
🍕 pan-db misc-data-items-order
🔥 anvi-profile. Creates a single anvi'o profile database. When it is run on a BAM file, depending on the user parameters, the program quantifies coverage per nucleotide position (and averages them per contig), calculates single-nucleotide, single-codon, and single-amino acid variants, as well as structural variants such as insertions and deletions, and stores these data into appropriate tables.
🧀 bam-file contigs-db
🍕 single-profile-db misc-data-items-order variability-profile
🔥 anvi-refine. Start an anvi'o interactive interface to manually curate or refine a genome, whether it is a metagenome-assembled, single-cell, or an isolate genome.
🧀 profile-db contigs-db bin
🍕 bin
🔥 anvi-rename-bins. Rename all bins in a given collection (so they have pretty names).
🧀 collection bin profile-db contigs-db
🍕 collection bin
🔥 anvi-report-linkmers. Reports sequences stored in one or more BAM files that cover one or more specific nucleotide positions in a reference.
🧀 bam-file
🍕 linkmers-txt
🔥 anvi-run-hmms. This program deals with populating tables that store HMM hits in an anvi'o contigs database.
🧀 contigs-db hmm-source
🍕 hmm-hits
🔥 anvi-run-interacdome. Run InteracDome on a contigs database.
🧀 contigs-db interacdome-data
🍕 binding-frequencies-txt misc-data-amino-acids
🔥 anvi-run-kegg-kofams. Run KOfam HMMs on an anvi'o contigs database.
🧀 contigs-db kegg-data
🍕 kegg-functions functions
🔥 anvi-run-ncbi-cogs. This program runs NCBI's COGs to associate genes in an anvi'o contigs database with functions. The COGs database was designed as an attempt to classify proteins from completely sequenced genomes on the basis of the orthology concept.
🧀 contigs-db cogs-data
🍕 functions
🔥 anvi-run-pfams. Run Pfam on Contigs Database.
🧀 contigs-db pfams-data
🍕 functions
🔥 anvi-run-scg-taxonomy. The purpose of this program is to affiliate single-copy core genes in an anvi'o contigs database with taxonomic names. A properly set up local SCG taxonomy database is required for this program to perform properly. After its successful run, anvi-estimate-scg-taxonomy will be useful to estimate taxonomy at genome-, collection-, or metagenome-level.
🧀 contigs-db scgs-taxonomy-db
🍕 scgs-taxonomy
🔥 anvi-run-trna-taxonomy. The purpose of this program is to affiliate tRNA gene sequences in an anvi'o contigs database with taxonomic names. A properly set up local tRNA taxonomy database is required for this program to perform properly. After its successful run, anvi-estimate-trna-taxonomy will be useful to estimate taxonomy at genome-, collection-, or metagenome-level.
🧀 contigs-db trna-taxonomy-db
🍕 trna-taxonomy
🔥 anvi-run-workflow. Execute, manage, parallelize, and troubleshoot entire 'omics workflows and chain together anvi'o and third party programs.
🧀 samples-txt fasta-txt workflow-config
🍕 contigs-workflow metagenomics-workflow pangenomics-workflow phylogenomics-workflow trnaseq-workflow
🔥 anvi-scan-trnas. Identify and store tRNA genes in a contigs database.
🧀 contigs-db
🍕 hmm-hits
🔥 anvi-search-functions. Search functions in an anvi'o contigs database or genomes storage. Basically, this program searches for one or more search terms you define in functional annotations of genes in an anvi'o contigs database, and generates multiple reports. The default report simply tells you which contigs contain genes with functions matching the search terms you used, which is useful for viewing in the interface. You can also request a much more comprehensive report, which gives you anything you might need to know for each hit and search term.
🧀 contigs-db genomes-storage-db
🍕 functions-txt
🔥 anvi-setup-interacdome. Set up InteracDome data.
🍕 interacdome-data
🔥 anvi-setup-kegg-kofams. Download and set up KEGG KOfam HMM profiles and KEGG MODULE data.
🍕 kegg-data modules-db
🔥 anvi-setup-ncbi-cogs. Download and set up NCBI's Clusters of Orthologous Groups database.
🍕 cogs-data
🔥 anvi-setup-pdb-database. Set up or update an offline database of representative PDB structures clustered at 95%.
🍕 pdb-db
🔥 anvi-setup-pfams. Download and set up Pfam data from the EBI.
🍕 pfams-data
🔥 anvi-setup-scg-taxonomy. The purpose of this program is to download necessary information from GTDB, and set it up in such a way that your anvi'o installation is able to assign taxonomy to single-copy core genes using anvi-run-scg-taxonomy and estimate taxonomy for genomes or metagenomes using anvi-estimate-scg-taxonomy.
🍕 scgs-taxonomy-db
🔥 anvi-setup-trna-taxonomy. The purpose of this program is to set up the necessary databases for tRNA genes collected from GTDB genomes in your local anvi'o installation, so that taxonomy information for a given set of tRNA sequences can be identified using anvi-run-trna-taxonomy and made sense of via anvi-estimate-trna-taxonomy.
🍕 trna-taxonomy-db
🔥 anvi-show-collections-and-bins. A script to display collections stored in an anvi'o profile or pan database.
🧀 pan-db profile-db
🔥 anvi-show-misc-data. Show all misc data keys in all misc data tables.
🧀 pan-db profile-db contigs-db
🔥 anvi-split. Split an anvi'o pan or profile database into smaller, self-contained pieces. Provide either a genomes-storage and pan database or a profile and contigs database pair, and you'll get back directories of individual projects for each bin that can be treated as smaller anvi'o projects.
🧀 profile-db contigs-db genomes-storage-db pan-db collection
🍕 split-bins
🔥 anvi-summarize. Summarizer for anvi'o pan or profile db's. Essentially, this program takes a collection id along with either a profile database and a contigs database or a pan database and a genomes storage and generates a static HTML output for what is described in a given collection. The output directory will contain almost everything any downstream analysis may need, and can be displayed using a browser without the need for an anvi'o installation. For this reason alone, reporting summary outputs as supplementary data with publications is a great idea for transparency and reproducibility.
🧀 profile-db contigs-db collection pan-db genomes-storage-db
🍕 summary
🔥 anvi-trnaseq. A program to process a raw tRNA-seq dataset, which is the sequencing of tRNA transcripts in a given sample, to generate an anvi'o tRNA-seq database.
🧀 trnaseq-fasta
🍕 trnaseq-db
🔥 anvi-update-db-description. Update the description in an anvi'o database.
🧀 pan-db profile-db contigs-db genomes-storage-db
🔥 anvi-update-structure-database. Add or re-run genes from an already existing structure database. All settings used to generate your database will be used in this program.
🧀 contigs-db structure-db
🔥 anvi-script-add-default-collection. A script to add a 'DEFAULT' collection in an anvi'o pan or profile database with a bin named 'EVERYTHING' that describes all items available in the profile database.
🧀 pan-db profile-db contigs-db
🍕 collection bin
🔥 anvi-script-augustus-output-to-external-gene-calls. Takes in gene calls by AUGUSTUS v3.3.3, generates an anvi'o external gene calls file. It may work well with other versions of AUGUSTUS, too. It is just that no one has tested the script with different versions of the program.
🧀 augustus-gene-calls
🍕 external-gene-calls
🔥 anvi-script-calculate-pn-ps-ratio. This program calculates for each gene the ratio of pN/pS (the metagenomic analog of dN/dS) based on metagenomic read recruitment. However, unlike standard pN/pS calculations, it relies on codons rather than nucleotides for accurate estimations of synonymity.
🧀 contigs-db variability-profile-txt
🍕 pn-ps-data
🔥 anvi-script-compute-ani-for-fasta. Run ANI between contigs in a single FASTA file.
🧀 fasta
🍕 genome-similarity
🔥 anvi-script-filter-fasta-by-blast. Filter FASTA file according to BLAST table (remove sequences with bad BLAST alignment).
🧀 contigs-fasta blast-table
🍕 contigs-fasta
🔥 anvi-script-fix-homopolymer-indels. Corrects homopolymer-region associated INDELs in a given genome based on a reference genome. The most effective use of this script is when the input genome is a genome reconstructed from MinION long reads, and the reference genome is one that is of high quality. Essentially, this script will BLAST the genome you wish to correct against the reference genome you provide, identify INDELs in the BLAST results that are exclusively associated with homopolymer regions, and will take the reference genome as a guide to correct the input sequences, and report a new FASTA file. You can use the output FASTA file that is fixed as the input FASTA file over and over again to see if you can eliminate all homopolymer-associated INDELs.
🧀 fasta
🍕 fasta
🔥 anvi-script-gen-distribution-of-genes-in-a-bin. Quantify the detection of genes in genomes in metagenomes to identify the environmental core. This is a helper script for anvi'o metapangenomic workflow.
🧀 contigs-db profile-db collection bin
🍕 view-data misc-data-items-txt
🔥 anvi-script-gen-hmm-hits-matrix-across-genomes. A simple script to generate a TAB-delimited file that reports the frequency of HMM hits for a given HMM source across contigs databases.
🧀 external-genomes internal-genomes hmm-source hmm-hits
🍕 hmm-hits-matrix-txt
🔥 anvi-script-gen-pseudo-paired-reads-from-fastq. A script that takes a FASTQ file that is not paired-end (i.e., R1 alone) and converts it into two FASTQ files that are paired-end (i.e., R1 and R2). This is a quick-and-dirty workaround that halves each read from the original FASTQ and puts one half in the FASTQ file for R1 and puts the reverse-complement of the second half in the FASTQ file for R2. If you've ended up here, things have clearly not gone very well for you, and Evan, who battled similar battles and ended up implementing this solution wholeheartedly sympathizes.
🧀 short-reads-fasta
🍕 short-reads-fasta
🔥 anvi-script-gen-short-reads. Generate short reads from contigs. Useful to reconstruct mock data sets from already assembled contigs.
🧀 configuration-ini
🍕 short-reads-fasta
🔥 A simple script to generate info from search tables, given a contigs-db.
🍕 genes-stats
🔥 anvi-script-get-coverage-from-bam. Get nucleotide-level, contig-level, or bin-level coverage values from a BAM file.
🧀 bam-file collection-txt
🍕 coverages-txt
🔥 anvi-script-get-hmm-hits-per-gene-call. A simple script to generate a TAB-delimited file of gene caller IDs and their HMM hits for a given HMM source.
🧀 contigs-db hmm-source hmm-hits
🍕 functions-txt
🔥 anvi-script-get-primer-matches. You provide this program with FASTQ files for one or more samples AND one or more short sequences, and it collects reads from FASTQ files that match your sequences. This tool can be most powerful if you want to collect all short reads from one or more metagenomes that are downstream to a known sequence. Using the comprehensive output files you can analyze the diversity of sequences visually, manually, or using established strategies such as oligotyping.
🧀 samples-txt
🍕 short-reads-fasta
🔥 anvi-script-merge-collections. Generate an additional data file from multiple collections.
🧀 contigs-db collection-txt
🔥 anvi-script-pfam-accessions-to-hmms-directory. You give this program one or more PFAM accession ids, and it generates an anvi'o compatible HMM directory to be used with anvi-run-hmms.
🍕 hmm-source
🔥 anvi-script-process-genbank. This script takes a GenBank file, and outputs a FASTA file, as well as two additional TAB-delimited output files for external gene calls and gene functions that can be used with the programs anvi-gen-contigs-database and anvi-import-functions.
🧀 genbank-file
🍕 contigs-fasta external-gene-calls functions-txt
🔥 anvi-script-process-genbank-metadata. This script takes the 'metadata' output of the program ncbi-genome-download, and processes each GenBank file found in the metadata file to generate a FASTA file, as well as genes and functions files for each entry. Plus, it automatically generates a FASTA TXT file descriptor for anvi'o snakemake workflows. So it is a multi-talented program like that.
🍕 contigs-fasta functions-txt external-gene-calls
🔥 anvi-script-reformat-fasta. Reformat FASTA file (remove contigs based on length, or based on a given list of deflines, and/or generate an output with simpler names).
🧀 fasta
🍕 contigs-fasta
🔥 anvi-script-snvs-to-interactive. Takes the output of anvi-gen-variability-profile and prepares an output for the interactive interface.
🧀 variability-profile-txt
🍕 interactive
🔥 anvi-script-transpose-matrix. Transpose a TAB-delimited file.
🧀 view-data functions-txt misc-data-items-txt misc-data-layers-txt gene-calls-txt linkmers-txt
🍕 view-data functions-txt misc-data-items-txt misc-data-layers-txt gene-calls-txt linkmers-txt
🔥 anvi-script-variability-to-vcf. A script to convert SNV output obtained from anvi-gen-variability-profile to the standard VCF format.
🧀 variability-profile-txt
🍕 vcf