anvi-compute-functional-enrichment-across-genomes

A program that computes functional enrichment across groups of genomes.

Can consume

groups-txt genomes-storage-db external-genomes internal-genomes functions

Can provide

functional-enrichment-txt

Usage

This program computes functional enrichment across groups of genomes and generates a functional-enrichment-txt file.

For its sister programs, see anvi-compute-functional-enrichment-in-pan and anvi-compute-metabolic-enrichment.

Please also see anvi-display-functions, which can both calculate functional enrichment and provide an interactive interface to display the distribution of functions.

Functional enrichment

This program can be executed using genomes described through external-genomes, internal-genomes, and/or stored in a genomes-storage-db. In addition to specifying genome sources, you must provide a groups-txt file that declares which genome belongs to which group for the enrichment analysis.
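As a minimal sketch of what such a file might look like, the snippet below writes a small groups-txt file and counts the genomes in each group. It assumes the two-column, TAB-delimited layout with an `item` and a `group` header; the genome and group names are hypothetical, and the groups-txt documentation remains the authority on the exact format.

```shell
# Hypothetical genome and group names, for illustration only.
groups_txt="$(mktemp)"
printf '%s\t%s\n' \
    item group \
    genome_01 host_associated \
    genome_02 host_associated \
    genome_03 free_living \
    genome_04 free_living > "$groups_txt"

# Sanity check: count how many genomes were assigned to each group,
# skipping the header line.
group_sizes="$(awk -F'\t' 'NR > 1 { n[$2]++ } END { for (g in n) print g "=" n[g] }' "$groups_txt" | LC_ALL=C sort)"
printf '%s\n' "$group_sizes"
```

A quick count like this can catch misspelled group names before running the enrichment analysis, since a typo would show up as an unexpected extra group.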

How does it work?

  1. Aggregate functions from all sources. Gene calls in each genome are tallied according to their functional annotations from the specified annotation source.

  2. Quantify the distribution of functions in each group of genomes. This information is then processed by anvi-script-enrichment-stats to fit a Generalized Linear Model (GLM) that determines (1) the extent to which a particular functional annotation is unique to a single group and (2) the percentage of genomes in which it appears within each group. This analysis produces a functional-enrichment-txt file.
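The two steps above can be illustrated with a toy tally. This is only a sketch of the counting idea, not anvi'o's implementation, and it does not fit the GLM: it takes a hypothetical table of (genome, group, function) rows and reports, for each function and group, the number of genomes carrying that function out of the group's total. All names in it are made up for illustration.

```shell
# Toy annotation table: genome, group, function (TAB-delimited).
# All names are hypothetical.
annotations="$(mktemp)"
printf '%s\t%s\t%s\n' \
    genome_01 group_A func_X \
    genome_01 group_A func_Y \
    genome_02 group_A func_X \
    genome_03 group_B func_Y \
    genome_04 group_B func_Y \
    genome_04 group_B func_Y > "$annotations"

# For every function and group, count the genomes in which the function
# occurs at least once, alongside the number of genomes in that group.
result="$(awk -F'\t' '
{
    # Count each genome once towards the size of its group.
    if (!($1 in seen_genome)) { seen_genome[$1] = 1; group_size[$2]++ }

    # Count each (function, group, genome) combination once, so repeated
    # gene calls with the same function do not inflate the tally.
    key = $3 SUBSEP $2 SUBSEP $1
    if (!(key in seen)) { seen[key] = 1; present[$3 SUBSEP $2]++; functions[$3] = 1 }
}
END {
    for (f in functions)
        for (g in group_size)
            printf "%s\t%s\t%d/%d\n", f, g, present[f SUBSEP g] + 0, group_size[g]
}' "$annotations" | LC_ALL=C sort)"
printf '%s\n' "$result"
```

In this toy input, func_X occurs in both group_A genomes and in no group_B genome, which is the kind of skewed distribution the enrichment test is designed to detect.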

The script anvi-script-enrichment-stats was implemented by Amy Willis, and was first described in this paper.

Basic usage

You can execute this program with a single source of genomes:

anvi-compute-functional-enrichment-across-genomes -i internal-genomes \
                                                  -o functional-enrichment-txt \
                                                  -G groups-txt \
                                                  --annotation-source FUNCTION_SOURCE

or multiple sources:

anvi-compute-functional-enrichment-across-genomes -i internal-genomes \
                                                  -e external-genomes \
                                                  -G groups-txt \
                                                  -g genomes-storage-db \
                                                  -o functional-enrichment-txt \
                                                  --annotation-source FUNCTION_SOURCE

Additional Parameters

You can generate a tab-delimited matrix describing the occurrence (counts) of each function within each genome using the --functional-occurrence-table-output parameter:

anvi-compute-functional-enrichment-across-genomes -i internal-genomes \
                                                  -G groups-txt \
                                                  -o functional-enrichment-txt \
                                                  --annotation-source FUNCTION_SOURCE \
                                                  --functional-occurrence-table-output FUNC_OCCURRENCE.TXT
