anvi-script-filter-fasta-by-blast [program]

Filter FASTA file according to BLAST table (remove sequences with bad BLAST alignment).


Can provide

contigs-fasta

Can consume

contigs-fasta blast-table

Usage

This program takes a contigs-fasta and blast-table and removes sequences without BLAST hits of a certain level of confidence.

For example, you could use this program to filter out sequences that do not have high-confidence taxonomy assignments before running a phylogenomic analysis.

To run this program, you’ll need to provide the contigs-fasta that you’re planning to filter, the blast-table, a list of the column headers in your blast-table (as given to BLAST by -outfmt), and a proper_pident threshold below which sequences are removed. Here, proper_pident is the percent of the query amino acids that were identical to the corresponding matched amino acids. Note that this differs from BLAST’s pident, which is calculated over the aligned region only and so ignores any unaligned parts of the query.
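To make the difference from pident concrete, here is a minimal sketch of one way a query-relative identity could be computed from the columns of a tabular BLAST hit. The formula (identical positions estimated as pident times alignment length, divided by qlen) and the function name `query_relative_identity` are assumptions made for illustration; they are not taken from the anvi’o source, which may treat gaps or multiple hits differently.

```python
def query_relative_identity(pident: float, length: int, qlen: int) -> float:
    """Percent of the whole query that is identical to the subject.

    Hypothetical helper: `pident` and `length` describe the aligned region
    only, while `qlen` is the full query length, so any unaligned part of
    the query lowers the result.
    """
    if qlen <= 0:
        raise ValueError("qlen must be positive")
    # pident / 100 * length approximates the number of identical positions;
    # dividing by qlen and rescaling to percent gives pident * length / qlen.
    return pident * length / qlen


# A 200 aa query with a 100-column alignment at 90% identity:
# BLAST reports pident = 90, but only about 90 of the 200 query residues match.
print(query_relative_identity(pident=90.0, length=100, qlen=200))  # 45.0
```

Under this assumed definition, a short but near-perfect alignment to a long query scores low, which is the behaviour the threshold is meant to capture.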

For example, if you ran

anvi-script-filter-fasta-by-blast -f contigs-fasta \
                                  -o path/to/contigs-fasta \
                                  -b blast-table \
                                  -s qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen \
                                  -t 30

then the output file would be a contigs-fasta that contains only the sequences in your input file that have a hit in your blast-table with a proper_pident above 30, that is, with more than 30 percent of the query amino acids identical to their matched amino acids.
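The filtering behaviour described here can be sketched in plain Python. This is an illustration under stated assumptions, not the program’s actual implementation: the helper names `passing_queries` and `filter_fasta` are hypothetical, the score is assumed to be pident times alignment length divided by qlen, and the example rows are made-up data in the column order used in the command above.

```python
def passing_queries(blast_lines, columns, threshold):
    """Return the qseqid values that have at least one hit above `threshold`."""
    keep = set()
    for line in blast_lines:
        if not line.strip():
            continue
        row = dict(zip(columns, line.rstrip("\n").split("\t")))
        # Assumed formula: identical positions relative to the full query length.
        score = float(row["pident"]) * int(row["length"]) / int(row["qlen"])
        if score > threshold:
            keep.add(row["qseqid"])
    return keep


def filter_fasta(fasta_lines, keep):
    """Yield only the FASTA lines that belong to records whose ID is in `keep`."""
    writing = False
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            writing = bool(parts) and parts[0] in keep
        if writing:
            yield line


columns = ("qseqid sseqid pident length mismatch gapopen qstart qend "
           "sstart send evalue bitscore qlen slen").split()
blast = [  # made-up tab-separated hits
    "seq1\tref1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t200\t120\t150",
    "seq2\tref2\t40.0\t50\t30\t0\t1\t50\t1\t50\t1e-5\t40\t200\t180",
]
fasta = [">seq1\n", "MKV\n", ">seq2\n", "MLL\n", ">seq3\n", "MAA\n"]

kept = passing_queries(blast, columns, threshold=30)
print("".join(filter_fasta(fasta, kept)), end="")
```

In this toy input, seq1 scores 75 and is kept, seq2 scores 10 and is dropped, and seq3 is dropped because it has no hit at all.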

