A program to compute genes databases for a given set of bins stored in an anvi'o collection. Genes databases store gene-level coverage and detection statistics, and they are usually generated automatically when they are required (such as when running anvi-interactive with the --gene-mode flag). This program allows you to pre-compute them ahead of time instead.

Can consume

profile-db contigs-db collection bin

Can provide

genes-db

This program generates a genes-db, which stores the coverage and detection values for all of the genes in your contigs-db.

This information is usually calculated when it’s needed (for example when running anvi-interactive in genes mode), but this program lets you break this process into two steps. This way, you can easily change the parameters of anvi-interactive without having to recalculate the gene-level statistics.

Given a contigs-db and profile-db pair, as well as a collection, this program will calculate the stats for the genes in each of your bins and give each bin its own profile-db that includes this information.

For example, if a collection called GENE_COLLECTION contained the bins bin_0001, bin_0002, and bin_0003 and you ran:

anvi-gen-gene-level-stats-databases -c contigs-db \
                                    -p profile-db \
                                    -C collection

then it would create a directory called GENES that contains three profile-db files named GENE_COLLECTION-bin_0001.db, GENE_COLLECTION-bin_0002.db, and GENE_COLLECTION-bin_0003.db. In terms of output, this program is similar to anvi-split: each of these databases can now be treated as a self-contained anvi’o project, but they also contain the gene-level information. You could then run anvi-interactive in genes mode on one of these profile databases.
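
The naming scheme described above can be sketched as a small shell helper, which may be handy for checking which databases have already been pre-computed. This is only a sketch, not part of anvi'o: the function name expected_genes_db is hypothetical, and it assumes the GENES directory sits inside the directory you pass as the first argument.

```shell
# Hypothetical helper (not an anvi'o command): print the path where the
# database for one bin is expected, following the
# GENES/<collection>-<bin>.db pattern described above.
# Arguments: <base directory> <collection name> <bin name>
expected_genes_db() {
    printf '%s/GENES/%s-%s.db\n' "$1" "$2" "$3"
}

# Report which of the three example databases exist under the current
# directory (assumption: GENES was created here).
for bin in bin_0001 bin_0002 bin_0003; do
    db="$(expected_genes_db . GENE_COLLECTION "$bin")"
    if [ -f "$db" ]; then
        echo "found:   $db"
    else
        echo "missing: $db"
    fi
done
```

Any bin reported as missing has no pre-computed database at the assumed location yet.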

You also have the option to provide a list of bins (either as a file or as a string) to analyze instead of a single collection.
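
As a rough sketch of the file-based option, the snippet below writes one bin name per line to a file and passes that file to the program. The file name bins_to_analyze.txt is made up, and --bin-ids-file is assumed here to be the relevant flag, since that is the usual anvi'o name for this kind of parameter; please confirm it against the program's --help output for your version. The command is only attempted if anvi'o is available on your PATH.

```shell
# Write the bins of interest, one per line (the file name is arbitrary).
printf '%s\n' bin_0001 bin_0003 > bins_to_analyze.txt

# Assumption: --bin-ids-file is the flag that accepts this file.
# contigs-db, profile-db, and collection are placeholders, as in the
# example above; replace them with your own paths and collection name.
if command -v anvi-gen-gene-level-stats-databases >/dev/null 2>&1; then
    anvi-gen-gene-level-stats-databases -c contigs-db \
                                        -p profile-db \
                                        -C collection \
                                        --bin-ids-file bins_to_analyze.txt
fi
```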

Other Parameters

You can also change the definition of an outlier nucleotide position or switch calculations to use the INSeq/Tn-Seq statistical methods.

