Creates a database of protein structures. Predict protein structures using template-based homology modelling of genes in your contigs database, or import pre-computed PDB structures you already have.
This program creates a structure-db either by (a) attempting to solve for the 3D structures of proteins encoded by genes in your contigs-db using DIAMOND and MODELLER, or (b) importing pre-existing structures provided by the user using an external-structures file.
This section covers option (a), where the user is interested in having structures predicted for them.
DIAMOND first searches your sequence(s) against a database of proteins with a known structure. This database is downloaded from the Sali lab, who created and maintain MODELLER, and contains all of the PDB sequences clustered at 95% identity.
If any good hits are found, they are selected as templates, and their structures are nabbed either from the RCSB directly, or from a local pdb-db database which you can create yourself with anvi-setup-pdb-database. Then, anvi’o passes control over to MODELLER, which creates a 3D alignment for your sequence to the template structures, and makes final adjustments to it based off of empirical distributions of bond angles. For more information, check this blogpost.
The output of this program is a structure-db, which contains all of the modelled structures. Currently, the primary use of the structure-db is for interactive exploration with anvi-display-structure. You can also export your structures into external .pdb files with anvi-export-structures, or incorporate structural information in the variability-profile-txt with anvi-gen-variability-profile.
Here is a simple run:
anvi-gen-structure-database -c contigs-db \
                            --gene-caller-ids 1,2,3 \
                            -o STRUCTURE.db
Following this, you will have the structures for genes 1, 2, and 3 stored in STRUCTURE.db, assuming reasonable templates were found. Alternatively, you can provide the name of a file listing the gene caller IDs (one ID per line) with the flag --genes-of-interest.
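As a minimal sketch of the file-based alternative, the snippet below writes a genes-of-interest file with one ID per line and prints the matching command for review instead of running it. The file name genes_of_interest.txt and the path CONTIGS.db are placeholders chosen for illustration, not names anvi'o requires.

```shell
# Write one gene caller ID per line (1, 2 and 3 are placeholder IDs).
printf '%s\n' 1 2 3 > genes_of_interest.txt

# Assemble the command; CONTIGS.db is a placeholder for your contigs-db.
cmd="anvi-gen-structure-database -c CONTIGS.db --genes-of-interest genes_of_interest.txt -o STRUCTURE.db"

# Print the command so it can be checked before it is run by hand.
echo "$cmd"
```

Keeping the IDs in a file is handy when the list is long or is produced by another step in your workflow.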
If you have already run anvi-setup-pdb-database and therefore have a local copy of representative PDB structures, make sure you use it by providing the --offline flag. If you put it in a non-default location, provide the path to your pdb-db:
anvi-gen-structure-database -c contigs-db \
                            --gene-caller-ids 1,2,3 \
                            --pdb-database pdb-db \
                            -o STRUCTURE.db
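The choice between a local pdb-db and fetching templates from the RCSB can be sketched as a small wrapper. This is an illustration under assumptions: path/to/pdb-db and CONTIGS.db are placeholders, and combining --offline with --pdb-database in a single call is assumed from the description above rather than confirmed here. The command is printed, not executed.

```shell
# Placeholder location of a pdb-db created by anvi-setup-pdb-database.
PDB_DB="path/to/pdb-db"

if [ -e "$PDB_DB" ]; then
    # A local copy exists: stay offline and point anvi'o at it.
    pdb_flags="--offline --pdb-database $PDB_DB"
else
    # No local copy: leave the flags out and say so on stderr.
    pdb_flags=""
    echo "No pdb-db found at $PDB_DB; templates would come from the RCSB." >&2
fi

# Print the resulting command for review.
echo "anvi-gen-structure-database -c CONTIGS.db --gene-caller-ids 1,2,3 $pdb_flags -o STRUCTURE.db"
```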
To quickly get a very rough estimate for your structures, you can run with the flag --very-fast.
If you already possess structures and would like to create a structure-db for downstream anvi’o uses such as anvi-display-structure, you should create an external-structures file. Then, create the database as follows:
anvi-gen-structure-database -c contigs-db \
                            --external-structures external-structures \
                            -o STRUCTURE.db
Please avoid using any MODELLER-specific parameters when using this mode, as they will be silently ignored.
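For illustration, an external-structures file could be put together as in the sketch below. It assumes the file is tab-delimited with a gene_callers_id column and a path column; that layout, the file name external-structures.txt, and the .pdb paths are all assumptions made for this example, so check the external-structures documentation for the authoritative format.

```shell
# Header plus one row per gene; fields are separated by a TAB character.
# Column names and paths are assumptions for illustration only.
printf 'gene_callers_id\tpath\n' > external-structures.txt
printf '1\tstructures/gene_1.pdb\n' >> external-structures.txt
printf '2\tstructures/gene_2.pdb\n' >> external-structures.txt

# Sanity check: every line must have exactly two tab-separated fields.
awk -F'\t' 'NF != 2 { bad = 1 } END { exit bad + 0 }' external-structures.txt \
    && echo "external-structures.txt has two columns on every line"
```

A quick column check like this catches stray spaces used in place of tabs before the file is handed to anvi'o.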
Here, we will go through a brief overview of the MODELLER parameters that you are able to change. See this page for more information.
The default sequence database is pdb_95, which can be found here. This is the same database that is downloaded by anvi-setup-pdb-database.
The default scoring function is DOPE_score.
The default MODELLER executable is mod9.19, but anvi’o is somewhat intelligent and will look for the most recent version it can find.
For a case study on how some of these parameters matter, see here.
You also have the option to dump the raw modelling output into a directory of your choice by providing its path with the flag --dump-dir.