variability-profile-txt [artifact]

TXT

A TXT-type anvi’o artifact. This artifact can be generated, used, and/or exported by anvi’o. It can also be provided by the user for anvi’o to import into its databases, process, and/or use.


Provided by

anvi-gen-variability-profile

Required or used by

anvi-display-structure anvi-gen-fixation-index-matrix anvi-script-calculate-pn-ps-ratio anvi-script-snvs-to-interactive anvi-script-variability-to-vcf

Description

This artifact contains information about the SNVs, SCVs, and SAAVs found across a profile-db. Its contents are thoroughly described on this blogpost.

This is generated by anvi-gen-variability-profile, which is also described in that blogpost.

Unsure what SNVs, SCVs, and SAAVs are, or looking for a refresher? You can find that information on the same blogpost.

In summary, go to the blogpost. Because the blogpost preceded this document by 5 years, most of the pertinent information that should be in here is actually over there. One day we will remedy this situation. Until then, this document serves as a quick reference for content explained at greater length in the blogpost.

variability-profile-txt is the output matrix for your SNVs, SCVs, or SAAVs. What you do with your variability-profile-txt is entirely up to your discretion. We maintain the stance that this output should be as raw as possible, so that you can analyze it how you please. Attached to each SNV, SCV, and SAAV is a plethora of annotated information.

What kinds of information?

SNVs

For each of your SNVs, this matrix includes its position in the contig and gene, the sample, coverage data, the A, C, G, and T counts, the reference and consensus nucleotides, the entropy value, and more.
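
As a minimal sketch of working with the raw SNV matrix, the Python snippet below parses a tab-delimited table and derives per-row nucleotide frequencies and Shannon entropy from the A, C, G, and T counts. The column names (sample_id, pos_in_contig, coverage, A, C, G, T, reference) and the two data rows are assumptions made up for illustration, so check them against the header of your own file. The entropy here uses the natural logarithm, which may not match the exact definition behind anvi’o’s own entropy column.

```python
import csv
import io
import math

# Made-up, tab-delimited excerpt standing in for a real variability profile.
# Column names are assumptions; compare them against your own file's header.
EXAMPLE_TSV = (
    "sample_id\tpos_in_contig\tcoverage\tA\tC\tG\tT\treference\n"
    "sample_01\t1204\t100\t90\t10\t0\t0\tA\n"
    "sample_02\t1204\t50\t25\t25\t0\t0\tA\n"
)

NUCLEOTIDES = ("A", "C", "G", "T")


def nucleotide_frequencies(row):
    """Return {nucleotide: frequency} computed from the count columns of one row."""
    counts = {nt: int(row[nt]) for nt in NUCLEOTIDES}
    total = sum(counts.values())
    if total == 0:
        return {nt: 0.0 for nt in NUCLEOTIDES}
    return {nt: counts[nt] / total for nt in NUCLEOTIDES}


def shannon_entropy(frequencies):
    """Shannon entropy (natural log) of a frequency distribution."""
    return -sum(f * math.log(f) for f in frequencies.values() if f > 0)


def summarize(handle):
    """Yield (sample, position, frequencies, entropy) for each row of the table."""
    for row in csv.DictReader(handle, delimiter="\t"):
        freqs = nucleotide_frequencies(row)
        yield row["sample_id"], int(row["pos_in_contig"]), freqs, shannon_entropy(freqs)


if __name__ == "__main__":
    for sample, pos, freqs, entropy in summarize(io.StringIO(EXAMPLE_TSV)):
        print(sample, pos, freqs, round(entropy, 3))
```

To run this on a real file, pass an open file handle to summarize() instead of the in-memory example.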

SCVs

This information will only appear if you requested it when running your earlier analysis. To do this, use the flag --profile-SCVs when you run anvi-profile. Then, when running anvi-gen-variability-profile use the flag --engine CDN.

For each SCV, this matrix details the position, sample, coverage data, counts for each of the 64 codons (AAA, AAC, …, TTG, TTT), entropy, synonymity, etc.
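
To illustrate what synonymity means for a pair of codons, here is a small Python sketch that builds the standard genetic code and checks whether two codons encode the same amino acid. It is only a conceptual illustration: the helper name is_synonymous is hypothetical, and the way anvi’o computes its synonymity column may differ.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), laid out in TCAG order:
# the first base changes slowest and the third base changes fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon_a, codon_b):
    """Return True if both codons translate to the same amino acid (or both are stops)."""
    return CODON_TABLE[codon_a.upper()] == CODON_TABLE[codon_b.upper()]


if __name__ == "__main__":
    # GAA and GAG both encode glutamate; GAT encodes aspartate.
    print(is_synonymous("GAA", "GAG"))
    print(is_synonymous("GAA", "GAT"))
```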

SAAVs

Like the information about SCVs, this information will only appear if you requested it when running your earlier analysis. To do this, use the flag --profile-SCVs when you run anvi-profile or anvi-merge. Then, when running anvi-gen-variability-profile use the flag --engine AA.

For each SAAV, this matrix details the position, sample, coverage data, counts for each of the 20 amino acids (as well as the stop codon), entropy, BLOSUM62, etc.

Structural information

If you provided anvi-gen-variability-profile with a structure-db, then you’ll also have some additional columns in your matrices. These include structural annotations, the residue’s solvent accessibility, information about bond angles, and a list of residues that are in physical contact with the residue you’re looking at.

For more information on any of this, check out this page, where every column in these matrices is not only listed, but explained.

Additional amino acid and nucleotide data

If you provided anvi-gen-variability-profile with the flag --include-additional-data and you have any misc-data-amino-acids data stored in your contigs-db, that data will be added as additional columns to the matrix.

This is currently only implemented for --engine AA and --engine CDN. --include-additional-data will not currently append misc-data-nucleotides data to your matrix output when --engine NT is used.
