anvi-compute-gene-cluster-homogeneity

Compute homogeneity for gene clusters.


Can consume

pan-db genomes-storage-db

Can provide

This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi’o artifacts without producing any immediate outputs.

Usage

This program computes both the geometric homogeneity and functional homogeneity for the gene clusters in a pan-db.

Geometric homogeneity and functional homogeneity are anvi’o-specific terms that describe, in different ways, how similar the genes within a gene cluster are to each other. Briefly, geometric homogeneity compares the positions of gaps in the aligned residues without considering specific amino acids, while functional homogeneity examines point mutations to amino acids and compares how chemically similar the resulting amino acids are. See this page for more details.
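To make the distinction between the two measures concrete, here is a minimal toy sketch in Python. It is not anvi’o’s actual scoring: the alignment, the chemical groups, and the function names `gap_pattern_agreement` and `chemical_agreement` are all hypothetical and chosen only for illustration. The first function looks only at where gaps fall, and the second looks only at how chemically similar the aligned residues are.

```python
from itertools import combinations

# Hypothetical toy alignment for one gene cluster (not real data).
ALIGNMENT = [
    "MKT-AYL",
    "MKT-AYL",
    "MRTGAY-",
]

# Coarse chemical groups, an assumption made for this sketch only.
GROUPS = {
    "nonpolar": set("GAVLIMPFW"),
    "polar": set("STCYNQ"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def group_of(residue):
    """Return the name of the toy chemical group a residue belongs to, or None."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    return None


def gap_pattern_agreement(seqs):
    """Mean fraction of columns where two sequences agree on gap vs. residue.

    Only gap positions matter here; the identity of the residues is ignored.
    """
    scores = []
    for a, b in combinations(seqs, 2):
        same = sum((x == "-") == (y == "-") for x, y in zip(a, b))
        scores.append(same / len(a))
    return sum(scores) / len(scores) if scores else 0.0


def chemical_agreement(seqs):
    """Mean fraction of ungapped column pairs that are identical or share a group.

    Gaps are skipped entirely; only residue chemistry matters here.
    """
    scores = []
    for a, b in combinations(seqs, 2):
        pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
        if not pairs:
            continue  # nothing to compare for this pair of sequences
        similar = sum(
            x == y or (group_of(x) is not None and group_of(x) == group_of(y))
            for x, y in pairs
        )
        scores.append(similar / len(pairs))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    print("gap-pattern agreement:", gap_pattern_agreement(ALIGNMENT))
    print("chemical agreement:", chemical_agreement(ALIGNMENT))
```

In this toy alignment the third sequence has its gap in a different column, which lowers the gap-pattern score, while its K to R substitution stays within the same (assumed) basic group and so does not lower the chemical score.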

You can run this program like so:

anvi-compute-gene-cluster-homogeneity -p pan-db \
                                      -g genomes-storage-db \
                                      -o path/to/output.txt \
                                      --store-in-db

This run will store the output directly in the database and also write it to a separate file at the specified output path.

You also have the option to calculate this information for only specific gene clusters by providing a gene cluster ID, a list of gene cluster IDs, a collection, or a bin.

To save on runtime, you can also enable --quick-homogeneity, which skips the check for horizontal geometric homogeneity (i.e., it will not look at alignments within a single gene). This is less accurate for detailed analyses, but it runs faster.

Here is an example run that uses this flag and only looks at a specific collection:

anvi-compute-gene-cluster-homogeneity -p pan-db \
                                      -g genomes-storage-db \
                                      -o path/to/output.txt \
                                      --store-in-db \
                                      -C collection \
                                      --quick-homogeneity

You can also use multithreading to speed up the computation; see the program’s --help output for the relevant option.
