A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
anvi-cluster-contigs anvi-compute-completeness anvi-db-info anvi-delete-functions anvi-delete-hmms anvi-display-contigs-stats anvi-display-metabolism anvi-display-structure anvi-estimate-genome-completeness anvi-estimate-metabolism anvi-estimate-scg-taxonomy anvi-estimate-trna-taxonomy anvi-export-contigs anvi-export-functions anvi-export-gene-calls anvi-export-gene-coverage-and-detection anvi-export-locus anvi-export-misc-data anvi-export-splits-and-coverages anvi-export-splits-taxonomy anvi-gen-fixation-index-matrix anvi-gen-gene-consensus-sequences anvi-gen-gene-level-stats-databases anvi-gen-structure-database anvi-gen-variability-profile anvi-get-aa-counts anvi-get-codon-frequencies anvi-get-pn-ps-ratio anvi-get-sequences-for-gene-calls anvi-get-sequences-for-hmm-hits anvi-get-short-reads-from-bam anvi-get-short-reads-mapping-to-a-gene anvi-get-split-coverages anvi-import-collection anvi-import-functions anvi-import-misc-data anvi-import-taxonomy-for-genes anvi-inspect anvi-interactive anvi-merge anvi-migrate anvi-profile anvi-profile-blitz anvi-refine anvi-rename-bins anvi-report-inversions anvi-run-hmms anvi-run-interacdome anvi-run-kegg-kofams anvi-run-ncbi-cogs anvi-run-pfams anvi-run-scg-taxonomy anvi-run-trna-taxonomy anvi-scan-trnas anvi-search-functions anvi-search-palindromes anvi-search-sequence-motifs anvi-show-misc-data anvi-split anvi-summarize anvi-summarize-blitz anvi-update-db-description anvi-update-structure-database anvi-script-add-default-collection anvi-script-filter-hmm-hits-table anvi-script-gen-distribution-of-genes-in-a-bin anvi-script-gen-genomes-file anvi-script-gen_stats_for_single_copy_genes.py anvi-script-get-hmm-hits-per-gene-call anvi-script-merge-collections anvi-script-permute-trnaseq-seeds
A contigs database is an anvi'o database that contains key information associated with your sequences.
In a way, an anvi'o contigs database is a modern, more talented form of a FASTA file: you can store additional information about your sequences in it, and others can query and use it. Information storage and access is primarily done by anvi'o programs; however, it can also be done through the command line interface or programmatically.
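As a minimal sketch of the programmatic route: a contigs database is an SQLite file, so Python's standard `sqlite3` module can open it for inspection. The helper name `list_tables` and the path `contigs.db` in the comment are hypothetical, and which tables you see depends on your anvi'o version. The sketch opens the file read-only, since writing to the database is best left to anvi'o programs.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in an SQLite file, sorted alphabetically."""
    # Build a file: URI and request read-only mode so this sketch
    # cannot modify the database it inspects.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Hypothetical usage, assuming a file named contigs.db exists:
#   for table in list_tables("contigs.db"):
#       print(table)
```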
The information a contigs database contains about its sequences can include the positions of open reading frames, tetra-nucleotide frequencies, functional and taxonomic annotations, information on individual nucleotide or amino acid positions, and more.
When working in anvi'o, you'll need to be able to access previous analyses done on a genome or transcriptome. To make this possible, anvi'o uses contigs databases instead of regular FASTA files. So, to use other anvi'o programs, you'll want to convert your data into a contigs database (using anvi-gen-contigs-database). As seen on the page for metagenomes, you can then use this contigs database instead of your FASTA file for all of your anvi'o needs.
In short, to get the most out of your data in anvi'o, you'll want to use your data (which was probably originally in a FASTA file) to create both a contigs-db and a profile-db. That way, anvi'o can keep track of many different kinds of analyses and you can easily interact with other anvi'o programs.
A contigs database is initialized by running anvi-gen-contigs-database on a contigs-fasta. This will compute the k-mer frequencies for each contig, soft-split your contigs, and identify open reading frames. To populate a contigs database with more information, you can then run various other programs.
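To make the k-mer step concrete, here is a minimal sketch of counting tetranucleotide (k = 4) frequencies for a single contig. It is an illustration only, not anvi'o's implementation: the function name `kmer_frequencies` is hypothetical, windows containing characters other than A, C, G, or T are simply skipped, and reverse complements are counted separately, which may differ from what anvi-gen-contigs-database does.

```python
from collections import Counter


def kmer_frequencies(sequence, k=4):
    """Count overlapping k-mers in a nucleotide sequence.

    Windows containing anything other than A, C, G, or T (e.g. N) are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    # Slide a window of width k across the sequence, one base at a time.
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts
```

For example, `kmer_frequencies("ACGTACGT")` counts `ACGT` twice and `CGTA`, `GTAC`, and `TACG` once each.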
Key programs that populate an anvi'o contigs database with essential information include,
Once an anvi'o contigs database is generated and populated with information, it is always a good idea to run anvi-display-contigs-stats to see a numerical summary of its contents.
Other programs you can run to populate a contigs database with functions include,
Other essential programs that read from a contigs database and yield key information include anvi-estimate-genome-completeness, anvi-get-sequences-for-hmm-hits, and anvi-estimate-scg-taxonomy.
If you wish to run programs like anvi-cluster-contigs, anvi-estimate-metabolism, and anvi-gen-gene-level-stats-databases, or view your database with anvi-interactive, you'll first need to use your contigs database to create a profile-db.
Contigs databases, like profile-dbs, are allowed to have different variants, though the only currently implemented variant, the trnaseq-contigs-db, is for tRNA transcripts from tRNA-seq experiments. The default variant stored for "standard" contigs databases is unknown. Variants should indicate that substantially different information is stored in the database. For instance, open reading frames are applicable to protein-coding genes but not tRNA transcripts, so ORF data is not recorded for the trnaseq variant. The trnaseq-workflow generates trnaseq-contigs-dbs using a very different approach to anvi-gen-contigs-database.