workflow-config

JSON

A JSON-type anvi’o artifact. This artifact is typically provided by the user for anvi’o to import into its databases, process, and/or use.

Provided by

There are no anvi’o tools that generate this artifact, which means it is most likely provided to the anvi’o ecosystem by the user.

Required by

anvi-run-workflow

Can be used by

anvi-migrate

Description

A JSON-formatted configuration file that describes the steps and parameters to be used by an anvi’o workflow.

You can create a default config file for a given workflow using the following command:

anvi-run-workflow --workflow ANVIO-WORKFLOW \
                  --get-default-config CONFIG.json

After this, the file CONFIG.json will contain all configurable flags and parameters for that workflow, each set to its default value. From there, you can edit this file to your heart’s content.
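If you would rather script an edit than open the file by hand, the following is a minimal Python sketch. It assumes only that the config is ordinary JSON. The helper names `set_rule_param` and `edit_config_file` are hypothetical and are not part of anvi’o.

```python
import copy
import json


def set_rule_param(config, rule, param, value):
    """Return a copy of `config` with config[rule][param] set to `value`.

    Hypothetical helper, not part of anvi'o. It refuses to create new keys,
    so a typo in a rule or parameter name raises KeyError instead of
    silently adding an entry.
    """
    section = config.get(rule)
    if not isinstance(section, dict) or param not in section:
        raise KeyError(f"{rule!r} -> {param!r} is not in this config")
    updated = copy.deepcopy(config)
    updated[rule][param] = value
    return updated


def edit_config_file(path, rule, param, value):
    """Load the JSON config at `path`, change one parameter, and save it."""
    with open(path) as handle:
        config = json.load(handle)
    config = set_rule_param(config, rule, param, value)
    with open(path, "w") as handle:
        json.dump(config, handle, indent=4)
        handle.write("\n")
    return config
```

For instance, `edit_config_file("CONFIG.json", "anvi_gen_contigs_database", "threads", 8)` would change the thread count for that rule, provided the section is present in your config. Because the default config already lists every configurable parameter, rejecting unknown keys is a cheap way to catch misspellings.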

What’s in this file?

The config file contains three types of information:

  1. General parameters, including the name of the workflow, the version of this config file, and links to the fasta-txt or samples-txt file.
  2. Rule-specific parameters, which let you set the parameters of the individual anvi’o programs that are run in the workflow.
  3. Output directory names, which tell anvi’o what to name the intermediate and final outputs (to help keep things organized).
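To make the three categories concrete, here is a small Python sketch that sorts the top-level keys of a loaded config into them. It rests on an assumption drawn only from the examples on this page: that `output_dirs` holds the directory names, that every other dictionary-valued key is a rule section, and that everything else is a general parameter. The function name `split_config` is hypothetical, and this is not how anvi’o itself parses the file.

```python
def split_config(config):
    """Split a config dict into (general, rule_params, output_dirs).

    Assumption: any top-level key whose value is a dict, other than
    "output_dirs", is a rule-specific section.
    """
    output_dirs = config.get("output_dirs", {})
    rule_params = {
        key: value
        for key, value in config.items()
        if isinstance(value, dict) and key != "output_dirs"
    }
    general = {
        key: value
        for key, value in config.items()
        if not isinstance(value, dict)
    }
    return general, rule_params, output_dirs
```

Calling it on a config with no rule sections returns an empty dictionary for `rule_params`.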

The LOGS_DIR entry controls the base workflow log name. Workflow logs are written under 00_LOGS, within a subdirectory for the workflow or named run. During a workflow run, rule logs are organized under this directory by rule name, and a tab-delimited manifest named <workflow-name>-workflow-manifest.tsv is written there as well. For example, LOGS_DIR values such as 00_LOGS, 00_LOGS_PHYLO, 00_LOGS_FIVE_PAN, or 00_LOGS-idba_ud become directories such as 00_LOGS/phylogenomics, 00_LOGS/phylogenomics, 00_LOGS/pangenomics, or 00_LOGS/idba_ud.

For example, the default config file for the contigs workflow has no rule-specific parameters and looks like this:

{
    "workflow_name": "contigs",
    "config_version": 1,
    "fasta_txt": "fasta.txt",
    "output_dirs": {
        "FASTA_DIR":   "01_FASTA_contigs_workflow",
        "CONTIGS_DIR": "02_CONTIGS_contigs_workflow",
        "LOGS_DIR":    "00_LOGS_contigs_workflow"
    }
}

On the other hand, the default config file for the metagenomics workflow is much longer, because it has a section for each rule with its rule-specific parameters. For example, its section on parameters for the program anvi-gen-contigs-database looks like this:

"anvi_gen_contigs_database": {
   "--project-name": "{group}",
   "threads": 5,
   "--description": "",
   "--skip-gene-calling": "",
   "--ignore-internal-stop-codons": "",
   "--skip-mindful-splitting": "",
   "--contigs-fasta": "",
   "--split-length": "",
   "--kmer-size": ""
},

Note that an empty string "" here means that the program anvi-gen-contigs-database will use its own default value for that parameter.
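When a config grows long, it can help to list only the parameters you have actually set. The sketch below filters out the empty-string entries of one rule section. The function name `explicit_params` is hypothetical, and the sketch only mirrors the convention described above rather than anvi’o’s internal handling.

```python
def explicit_params(rule_section):
    """Return only the entries of a rule section that are not "".

    An empty string means "use the program's default", so whatever
    remains is what this config sets explicitly.
    """
    return {key: value for key, value in rule_section.items() if value != ""}


# The anvi_gen_contigs_database section shown above, abbreviated.
section = {
    "--project-name": "{group}",
    "threads": 5,
    "--description": "",
    "--split-length": "",
    "--kmer-size": "",
}
print(explicit_params(section))
# prints {'--project-name': '{group}', 'threads': 5}
```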

For more details on the anvi’o snakemake workflows, please refer to this tutorial.