anvi-gen-gene-level-stats-databases [program]

A program to compute genes databases for a given set of bins stored in an anvi'o collection. Genes databases store gene-level coverage and detection statistics, and they are usually generated automatically when they are required (such as when running anvi-interactive with the --gene-mode flag). This program allows you to pre-compute them if you don't want them to be computed all at once.


Can provide

genes-db

Can consume

profile-db contigs-db collection bin

Usage

This program generates a genes-db, which stores the coverage and detection values for all of the genes in your contigs-db.

This information is usually calculated when it’s needed (for example when running anvi-interactive in genes mode), but this program lets you break this process into two steps. This way, you can easily change the parameters of anvi-interactive without having to recalculate the gene-level statistics.

Given a contigs-db and profile-db pair, as well as a collection, this program will calculate the stats for the genes in each of your bins and give each bin its own profile-db that includes this information.

For example, if a collection called GENE_COLLECTION contained the bins bin_0001, bin_0002, and bin_0003 and you ran:

anvi-gen-gene-level-stats-databases -c contigs-db \
                                    -p profile-db \
                                    -C collection

Then it will create a directory called GENES that contains three profile-db files called GENE_COLLECTION-bin_0001.db, GENE_COLLECTION-bin_0002.db, and GENE_COLLECTION-bin_0003.db. In terms of output, this program is similar to anvi-split: each of these databases can now be treated as a self-contained anvi’o project, but they also contain the gene-level information. Thus, you could then run anvi-interactive in genes mode on one of these profile databases.
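As a rough sketch of that follow-up step, the snippet below assembles an anvi-interactive call for one bin and prints it for review rather than running it. It assumes that gene mode is started from the original contigs-db and profile-db pair together with the collection and bin name, and that anvi-interactive then picks up the matching database under GENES. The file names are hypothetical placeholders, so check `anvi-interactive --help` before relying on it.

```shell
# Hypothetical file names; substitute the paths of your own databases.
CONTIGS_DB="contigs.db"
PROFILE_DB="profile.db"
COLLECTION="GENE_COLLECTION"
BIN="bin_0001"

# Assemble the gene-mode command for a single bin (assumed invocation pattern).
CMD="anvi-interactive -c $CONTIGS_DB -p $PROFILE_DB -C $COLLECTION -b $BIN --gene-mode"

# Print the command so it can be inspected before it is run by hand.
echo "$CMD"
```

Because the gene-level statistics are already stored, repeating this step with different display settings should not trigger the calculation again.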

You also have the option to provide a list of bins (either as a file or as a string) to analyze instead of a single collection.
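A minimal sketch of the file-based variant is shown below. It writes a plain-text file with one bin name per line and prints the command that would use it. The flag name `--bin-ids-file` and the file name `bins_of_interest.txt` are assumptions made for illustration; confirm the exact option names with the program's `--help` output.

```shell
# Hypothetical file listing the bins to process, one name per line.
BINS_FILE="bins_of_interest.txt"
printf '%s\n' bin_0001 bin_0003 > "$BINS_FILE"

# --bin-ids-file is an assumed flag name; verify it with --help before use.
CMD="anvi-gen-gene-level-stats-databases -c contigs-db -p profile-db -C GENE_COLLECTION --bin-ids-file $BINS_FILE"

# Print the command for review instead of executing it.
echo "$CMD"
```

With this file, only bin_0001 and bin_0003 would be processed, and bin_0002 would be left out.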

Other Parameters

You can also change the definition of an outlier nucleotide position or switch calculations to use the INSeq/Tn-Seq statistical methods.

