Anvi’o is an evolving software ecosystem, and its components are often described in multiple studies. Thus, while it can be annoying, the true best practice for your study may be to cite multiple publications if it benefits from multiple anvi’o features.
We know that finding the best studies to cite can be a lot of work. The purpose of this page is to offer up-to-date suggestions to help you finalize your anvi’o citations. But if you are unsure, please feel free to drop us a line, or find us on
Anvi’o often uses third-party software or resources (such as HMMER, Prodigal, MCL, GTDB, or NCBI), and the platform typically guides you to cite the relevant work when they are used for an anvi’o analysis. Suggestions on this page are specific to anvi’o and do not include third-party software, which you should also make sure to cite properly. Anvi’o does its best to remind you during its runtime what to cite regarding third-party software, and we implore you to be vigilant in recognizing work from others that enables anvi’o to help you.
We know this is difficult work and we are thankful for your attention.
TL;DR: Cite this if you have used any of the anvi’o programs or workflows.
If you have used anvi’o for anything at all, please consider citing this work, as it describes the software ecosystem in general. That ecosystem currently sits on more than 120,000 lines of code, which means any given anvi’o program benefits from the entirety of it:
It is important to cite this work as it includes all researchers who contributed to the evolution of the platform until 2021. Anyone who is using anvi’o today is benefiting from their contributions, and the best way to recognize their work is to cite this paper.
The rest of the citations on this page are specific for certain anvi’o features, and we thank you for your patience.
TL;DR: Cite this if you have used anvi-estimate-metabolism.
The following study is the first one that formally describes the anvi’o metabolism framework:
A recent study mentioned this framework the following way:
(…)
Analysis of metabolic modules and enrichment. We calculated the level of completeness for a given KEGG module (Kanehisa et al. 2014; Kanehisa et al. 2017) in our genomes using the program anvi-estimate-metabolism (Veseli et al. 2023), which leveraged previous annotation of genes with KEGG orthologs (KOs). Then, the program anvi-compute-functional-enrichment (Shaiber et al. 2020) determined whether a given metabolic module was enriched in based on the output from anvi-estimate-metabolism. The URL https://anvio.org/m/anvi-estimate-metabolism serves a tutorial for this program which details the modes of usage and output file formats (…)
Questions on metabolism? You can ask
TL;DR: Cite this if you have used anvi-gen-variability-profile, anvi-gen-fixation-index-matrix, anvi-display-structure, or anvi-get-pn-ps-ratio.
Much of the firepower in anvi’o for microbial population genetics, including the description of single-codon variants, fast characterization of single-nucleotide variants and IN/DELs, as well as linking genetic variation in the environment to predicted protein structures, is first described and used in this work:
If your work benefited from any of these features, please consider also citing it.
Questions on microbial population genetics? You can ask
TL;DR: Cite this if you have used anvi-compute-functional-enrichment-across-genomes, anvi-compute-functional-enrichment-in-pan, or anvi-compute-metabolic-enrichment.
Anvi’o includes features to study enrichment of functions in pangenomes or metabolic modules across genomes. The underlying logic for these features was described for the first time in this study:
In a recent study, we cited this work the following way:
(…)
Functional enrichment analyses. The statistical approach for enrichment analysis is defined elsewhere (Shaiber et al. 2020), but briefly the program anvi-compute-functional-enrichment determined enrichment scores for functions (or metabolic modules) within groups of genomes by fitting a binomial generalized linear model (GLM) to the occurrence of each function (or complete metabolic module) in each group, and then computing a Rao test statistic, uncorrected p-values, and corrected q-values. We considered any function or metabolic module with a q-value less than 0.05 to be ‘enriched’ in its associated group (…)
Questions on enrichment analyses? You can ask
TL;DR: Cite this one if you have used anvi-pan-genome or anvi-meta-pan-genome.
The anvi’o pangenomics and metapangenomics capabilities were first introduced in this study. If you are using anvi’o to generate pangenomes and/or investigate how to bring together pangenomes and metagenomes, please consider citing this work as well.
Questions on metapangenomics? You can ask
If you are using anvi’o to study microbial population genetics through single-codon or single-amino acid variants, please consider also citing this work:
Questions on single-amino acid variants? You can ask
There is no standalone publication that describes the anvi’o snakemake workflows, although they were first introduced in the following work:
In a recent study, we cited our workflows the following way:
(…)
‘Omics workflows. Whenever applicable, we automated and scaled our ‘omics analyses using the bioinformatics workflows implemented by the program anvi-run-workflow (Shaiber et al. 2020) in anvi’o (Eren et al. 2021). Anvi’o workflows implement numerous steps of bioinformatics tasks including short-read quality filtering, assembly, gene calling, functional annotation, hidden Markov model search, metagenomic read-recruitment, metagenomic binning, pangenomics, and phylogenomics. Workflows use Snakemake (Köster and Rahmann 2012) and a tutorial is available at the URL http://merenlab.org/anvio-workflows/. The following sections detail these steps. (…)
But please consider mentioning the specific workflow you’re using in your methods section, and giving a direct link to its help page listed here.
Questions on anvi’o snakemake workflows? You can ask
TL;DR: Cite this if you have used anvi-refine.
If you used anvi’o for metagenomic binning or for the refinement of genomes, please consider citing this study, too:
Questions on binning practices? You can ask