anvi-script-filter-hmm-hits-table

Filter weak HMM hits from a given contigs database using a domain hits table reported by anvi-run-hmms.


Authors

Can consume

contigs-db hmm-source hmm-hits

Can provide

hmm-hits

Usage

This program allows you to remove low-quality HMM alignments from an hmm-source in a contigs-db by leveraging HMM alignment parameters such as model-coverage (query-coverage) and gene-coverage (target-coverage) calculated from hmm-hits. Briefly, the program will remove all hmm-hits records that belong to the given hmm-source, then import into the contigs-db a new hmm-hits table that was filtered to your specifications.

For this, you first need to run anvi-run-hmms and ask HMMER to report a domain hits table by including the --domain-hits-table flag in your command:

anvi-run-hmms -c contigs-db \
    -I Bacteria_71 \
    --hmmer-output-dir path/to/dir \
    --domain-hits-table

After the command above, your HMM hits will be stored in your contigs-db as usual. However, with the domain hits table available, you can filter hits out of your contigs database using thresholds for the model or gene coverage of each hit, i.e., you can filter out hmm-hits where the profile HMM and the gene do not align well to each other.

For example, following the command above, the command below will remove hmm-hits from your contigs-db in which the alignment to the target gene covers less than 90% of the profile HMM:

anvi-script-filter-hmm-hits-table -c contigs-db \
    --hmm-source Bacteria_71 \
    --domain-hits-table path/to/dir/hmm.domtable \
    --model-coverage 0.9
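To make the thresholds concrete, here is a minimal Python sketch of how model and gene coverage could be derived from one domain hits table record and compared against a cutoff. It assumes that the profile HMM is the query and the gene is the target, and that coverage is the aligned span divided by the full length of the model or gene; the program's exact formula may differ. The `DomainHit` class and the `passes` function are hypothetical names used for illustration only and are not part of anvi'o.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    """Illustrative subset of one domain hits table record (hypothetical class)."""
    gene_length: int   # length of the target gene sequence
    model_length: int  # length of the profile HMM
    hmm_from: int      # first model position in the alignment (1-based)
    hmm_to: int        # last model position in the alignment
    ali_from: int      # first gene position in the alignment (1-based)
    ali_to: int        # last gene position in the alignment

    @property
    def model_coverage(self) -> float:
        # Assumption: fraction of the model spanned by the alignment.
        return (self.hmm_to - self.hmm_from + 1) / self.model_length

    @property
    def gene_coverage(self) -> float:
        # Assumption: fraction of the gene spanned by the alignment.
        return (self.ali_to - self.ali_from + 1) / self.gene_length


def passes(hit: DomainHit, min_model_coverage: float = 0.0,
           min_gene_coverage: float = 0.0) -> bool:
    """Keep a hit only if it meets both coverage thresholds."""
    return (hit.model_coverage >= min_model_coverage
            and hit.gene_coverage >= min_gene_coverage)


# Made-up records: the first aligns across most of the model, the second does not.
hits = [
    DomainHit(gene_length=200, model_length=100, hmm_from=1, hmm_to=95, ali_from=10, ali_to=180),
    DomainHit(gene_length=200, model_length=100, hmm_from=40, hmm_to=80, ali_from=50, ali_to=95),
]
kept = [h for h in hits if passes(h, min_model_coverage=0.9)]
print(len(kept))  # 1
```

With a model coverage threshold of 0.9, only the first record survives (0.95 versus 0.41), which mirrors the effect of --model-coverage 0.9 under the stated assumptions.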

Some HMM profiles align multiple times to the same gene at different coordinates. By default, anvi-script-filter-hmm-hits-table will use only one of those domain hits table records, which could represent very little alignment coverage. To combine the domain hits table records into one hit and thus increase alignment coverage, use the parameter --merge-partial-hits-within-X-nts. Briefly, if you give the parameter --merge-partial-hits-within-X-nts 300, anvi-script-filter-hmm-hits-table will merge all hits to the same gene in the domain hits table that have coordinates within 300 nucleotides of each other.

anvi-script-filter-hmm-hits-table -c contigs-db \
    --hmm-source Bacteria_71 \
    --domain-hits-table path/to/dir/hmm.domtable \
    --model-coverage 0.9 \
    --merge-partial-hits-within-X-nts 300
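The merging idea can be sketched in a few lines of Python. This is one plausible reading, not the program's actual implementation: it assumes that "within X" means the gap between the end of one partial hit and the start of the next is at most X, and it only looks at coordinates on the gene. The function name `merge_partial_hits` and the coordinates are made up for illustration.

```python
def merge_partial_hits(intervals, max_gap):
    """Merge (start, stop) intervals on one gene whose gap is at most max_gap."""
    merged = []
    for start, stop in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            # Too far away (or the first interval): start a new merged hit.
            merged.append((start, stop))
    return merged


# Three partial alignments of the same model to one gene (made-up coordinates).
partial_hits = [(10, 120), (150, 400), (900, 1000)]
print(merge_partial_hits(partial_hits, max_gap=300))  # [(10, 400), (900, 1000)]
```

The first two records are 30 positions apart and collapse into one longer hit, which raises its coverage; the third is 500 positions away and stays separate.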

The input domtblout file for anvi-script-filter-hmm-hits-table will be saved as hmm.domtable.orig, and the filtered output will be saved as hmm.domtable. If you later decide to change the coverage filtering threshold or --merge-partial-hits-within-X-nts, be sure to change the path for --domain-hits-table to hmm.domtable.orig so that the filter runs on the original, unfiltered table.
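If you script repeated runs, a small helper can pick the right input for you. The sketch below is a hypothetical convenience function, not part of anvi'o; it assumes only the file naming described above, where a previous run leaves an unfiltered copy with an .orig suffix next to the filtered table.

```python
from pathlib import Path


def domtable_for_rerun(domtable: Path) -> Path:
    """Return the unfiltered '.orig' copy if a previous run left one, else the table itself."""
    original = domtable.with_name(domtable.name + ".orig")
    return original if original.exists() else domtable


# Path to pass to --domain-hits-table on the next run.
print(domtable_for_rerun(Path("path/to/dir/hmm.domtable")))
```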

