Run KOfam HMMs on an anvi'o contigs database.
Essentially, this program uses the KEGG database to annotate functions and metabolic pathways in a contigs-db. More specifically, anvi-run-kegg-kofams annotates a contigs-db with HMM hits from KOfam, a database of KEGG Orthologs (KOs). You must set up these HMMs on your computer using anvi-setup-kegg-kofams before you can use this program.
Briefly, this program extracts all the gene calls from the contigs-db and checks each one for hits to the KOfam HMM profiles in your kegg-data. This can be time-consuming because the number of HMM profiles is quite large, and even more so if the contigs-db also contains many genes. Multi-threading is a good idea if you have the computational resources for it.
Many HMM hits will be found, most of them weak. The weak hits will by default be eliminated according to the score thresholds provided by KEGG; that is, only hits with scores above the threshold for a given KO profile will be annotated in the contigs-db. It is perfectly normal to notice that the number of raw hits found is many, many times larger than the number of annotated KO hits in your database.
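The thresholding described above can be illustrated with a minimal shell sketch. It is not how anvi'o implements the filter internally: the two tab-separated tables, their columns, and the scores are hypothetical, and the strict "greater than" comparison simply follows the wording "above the threshold".

```shell
# Hypothetical inputs (not real anvi'o files), both tab-separated:
#   thresholds: KO accession, score threshold
#   hits:       gene call id, KO accession, bit score
thresholds=$(mktemp)
hits=$(mktemp)
printf 'K00001\t100.0\nK00002\t250.5\n' > "$thresholds"
printf '1\tK00001\t150.2\n2\tK00001\t35.0\n3\tK00002\t251.0\n4\tK00002\t90.3\n' > "$hits"

# Keep a hit only if its score is strictly above the threshold of its KO.
# The first file loads thresholds into an array; the second file is filtered.
filter_hits() {
    awk -F '\t' '
        NR == FNR { thr[$1] = $2; next }
        ($2 in thr) && ($3 + 0) > (thr[$2] + 0)
    ' "$1" "$2"
}

kept=$(filter_hits "$thresholds" "$hits")
printf '%s\n' "$kept"
rm -f "$thresholds" "$hits"
```

In this toy input, two of the four raw hits survive (gene calls 1 and 3), which mirrors why the number of annotated KO hits is much smaller than the number of raw hits.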
In the contigs-db functions table, annotated KO hits (kegg-functions) will have the source KOfam.
Running this program is a prerequisite for metabolism estimation with anvi-estimate-metabolism. Note that if you plan to estimate metabolism, anvi-estimate-metabolism must be run with the same kegg-data that this program used to annotate KOfam hits.
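One way to keep the two steps consistent is to store the KEGG data location in a single variable and pass it to both programs. The following is a minimal sketch under stated assumptions: the path is a placeholder, the --kegg-data-dir flag on anvi-estimate-metabolism is an assumption you should check against that program's help output, and run is a hypothetical dry-run helper that only prints each command unless DRY_RUN=0 is set.

```shell
# Placeholder locations; adjust to your own setup.
KEGG_DIR="/path/to/directory/KEGG"
CONTIGS_DB="CONTIGS.db"

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Annotate KOfam hits using the KEGG data in KEGG_DIR ...
run anvi-run-kegg-kofams -c "$CONTIGS_DB" --kegg-data-dir "$KEGG_DIR"
# ... then estimate metabolism against the same KEGG data
# (assumes anvi-estimate-metabolism accepts --kegg-data-dir).
run anvi-estimate-metabolism -c "$CONTIGS_DB" --kegg-data-dir "$KEGG_DIR"
```

Because both commands read the same variable, the annotation and the metabolism estimate cannot silently point at different versions of the KEGG data.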
anvi-run-kegg-kofams -c CONTIGS.db
If you previously set up your KEGG data directory using --kegg-data-dir (see anvi-setup-kegg-kofams), or have moved the KEGG data directory that you wish to use to a non-default location (maybe you like keeping the older versions around when you update, we don’t know how you roll), then you may need to specify where to find the KEGG data so that this program can use the right one. In that case, this is how you do it:
anvi-run-kegg-kofams -c CONTIGS.db --kegg-data-dir /path/to/directory/KEGG
To run with multiple threads, pass the number of threads with -T:
anvi-run-kegg-kofams -c CONTIGS.db -T 4
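For multi-threaded runs like the one above, you may prefer not to hard-code the thread count. One option is to derive it from the number of online CPUs. This is only a sketch: getconf _NPROCESSORS_ONLN is widely available on Linux and macOS but not guaranteed everywhere, so the value falls back to 1, and the command is printed here rather than executed. On a shared machine you will likely want fewer threads than the total.

```shell
# Number of online CPUs, falling back to 1 if getconf cannot report it.
threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Guard against an empty or non-numeric answer.
case "$threads" in
    ''|*[!0-9]*) threads=1 ;;
esac

# Build the command line; it is printed here rather than run.
cmd="anvi-run-kegg-kofams -c CONTIGS.db -T $threads"
printf '%s\n' "$cmd"
```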
By default, anvi-run-kegg-kofams uses hmmsearch to find KO hits. If for some reason you would rather use a different program (hmmscan is also currently supported), you can do so:
anvi-run-kegg-kofams -c CONTIGS.db --hmmer-program hmmscan
By default, this program filters out weak HMM hits and keeps only those that are above the score threshold for a given KO. If you would like to turn off this behavior and keep all hits (there will be a lot of weak ones), you can follow the example below:
anvi-run-kegg-kofams -c CONTIGS.db --keep-all-hits