How to cite anvi'o like a pro

Anvi’o is an evolving software ecosystem, and its components are often described in multiple studies. Thus, while it can be annoying, the best practice may be to cite multiple publications if your study benefits from multiple anvi’o features.

We know that finding the best studies to cite can be a lot of work. The purpose of this page is to offer up-to-date suggestions to help you find out how to finalize your citations regarding anvi’o. But if you are unsure, please feel free to drop us a line, or find us on

Anvi’o often uses third-party software or resources (such as HMMER, Prodigal, MCL, GTDB, or NCBI), and the platform typically guides you to cite the relevant work when they are used for an anvi’o analysis. Suggestions on this page are specific to anvi’o and do not include third-party software, which you should also make sure to cite properly. Anvi’o does its best to remind you at runtime what to cite for third-party software, and we implore you to be vigilant in recognizing the work of others that enables anvi’o to help you.

We know this is difficult work and we are thankful for your attention.


Default citation

TL;DR: Cite this if you have used any of the anvi’o programs or workflows.

If you have used anvi’o for anything at all, please consider citing this work. It describes the software ecosystem in general, which currently sits on more than 120,000 lines of code, so any given anvi’o program benefits from the entirety of this ecosystem:

Community-led, integrated, reproducible multi-omics with anvi'o Eren AM, Kiefl E, Shaiber A, Veseli I, Miller SE, Schechter MS, Fink I, Pan JN, Yousef M, Fogarty EC, Trigodet F, Watson AR, Esen ÖC, Moore RM, Clayssen Q, Lee MD, Kivenson V, Graham ED, Merrill BD, Karkman A, Blankenberg D, Eppley JM, Sjödin A, Scott JJ, Vázquez-Campos X, McKay LJ, McDaniel EA, Stevens SLR, Anderson RE, Fuessel J, Fernandez-Guerra A, Maignien L, Delmont TO, Willis AD Nature Microbiology, 6(1):3-6 🔗

It is important to cite this work as it includes all researchers who contributed to the evolution of the platform until 2021. Anyone who is using anvi’o today is benefiting from their contributions, and the best way to recognize their work is to cite this paper.

The rest of the citations on this page are specific for certain anvi’o features, and we thank you for your patience.

Microbial metabolism

TL;DR: Cite this if you have used anvi-estimate-metabolism.

The following study is the first one that formally describes the anvi’o metabolism framework:

Microbes with higher metabolic independence are enriched in human gut microbiomes under stress Veseli I, Chen YT, Schechter MS, Vanni C, Fogarty EC, Watson AR, Jabri B, Blekhman R, Willis AD, Yu MK, Fernàndez-Guerra A, Füssel J, Eren AM 📚 eLife, 12(RP89862) | 🔍 Google Scholar | 🔗 doi:10.7554/eLife.89862

A recent study mentioned this framework the following way:

(…)

Analysis of metabolic modules and enrichment. We calculated the level of completeness for a given KEGG module (Kanehisa et al. 2014; Kanehisa et al. 2017) in our genomes using the program anvi-estimate-metabolism (Veseli et al. 2023), which leveraged previous annotation of genes with KEGG orthologs (KOs). Then, the program anvi-compute-functional-enrichment (Shaiber et al. 2020) determined whether a given metabolic module was enriched in based on the output from anvi-estimate-metabolism. The URL https://anvio.org/m/anvi-estimate-metabolism serves a tutorial for this program which details the modes of usage and output file formats (…)

Questions on metabolism? You can ask

Microbial population genetics

TL;DR: Cite this if you have used anvi-gen-variability-profile, anvi-gen-fixation-index-matrix, anvi-display-structure, or anvi-get-pn-ps-ratio.

Much of the firepower in anvi’o for microbial population genetics, including the description of single-codon variants, fast characterization of single-nucleotide variants and IN/DELs, as well as linking genetic variation in the environment to predicted protein structures, is first described and used in this work:

Structure-informed microbial population genetics elucidate selective pressures that shape protein evolution Kiefl E, Esen ÖC, Miller SE, Kroll KL, Willis AD, Rappé MS, Pan T, Eren AM 📚 Science Advances, 9(8):eabq4632 | 🔍 Google Scholar | 🔗 doi:10.1126/sciadv.abq4632

If your work benefited from any of these features, please consider also citing it.

Questions on microbial population genetics? You can ask

Functional or metabolic enrichment

TL;DR: Cite this if you have used anvi-compute-functional-enrichment-across-genomes, anvi-compute-functional-enrichment-in-pan, or anvi-compute-metabolic-enrichment.

Anvi’o includes features to study the enrichment of functions in pangenomes or of metabolic modules across genomes. The underlying logic for these features was described for the first time in this study:

Functional and genetic markers of niche partitioning among enigmatic members of the human oral microbiome Shaiber A, Willis AD, Delmont TO, Roux S, Chen L, Schmid AC, Yousef M, Watson AR, Lolans K, Esen ÖC, Lee STM, Downey N, Morrison HG, Dewhirst FE, Mark Welch JL, Eren AM Co-senior authors Genome Biology, 21:292 🔗

In a recent study, we cited this work the following way:

(…)

Functional enrichment analyses. The statistical approach for enrichment analysis is defined elsewhere (Shaiber et al. 2020), but briefly the program anvi-compute-functional-enrichment determined enrichment scores for functions (or metabolic modules) within groups of genomes by fitting a binomial generalized linear model (GLM) to the occurrence of each function (or complete metabolic module) in each group, and then computing a Rao test statistic, uncorrected p-values, and corrected q-values. We considered any function or metabolic module with a q-value less than 0.05 to be ‘enriched’ in its associated group (…)
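For readers who want a feel for what the quoted passage describes, here is a minimal sketch in plain Python. It is not anvi’o code and makes several assumptions: it handles only two groups of genomes, it relies on the fact that for a binomial GLM whose only predictor is group membership the Rao score statistic coincides with Pearson's chi-square statistic, and it uses the Benjamini-Hochberg adjustment as a stand-in for the corrected q-values, which may differ from the correction anvi’o applies. The function names and the example counts are hypothetical.

```python
import math


def rao_score_two_groups(present_a, total_a, present_b, total_b):
    """Score statistic and p-value for one function across two genome groups.

    Assumes a binomial model with group membership as the only predictor,
    in which case the score statistic equals Pearson's chi-square (1 df).
    """
    pooled = (present_a + present_b) / (total_a + total_b)
    if pooled in (0.0, 1.0):
        # The function occurs in no genome or in every genome: no signal.
        return 0.0, 1.0
    variance_unit = pooled * (1.0 - pooled)
    stat = sum(
        (observed - n * pooled) ** 2 / (n * variance_unit)
        for observed, n in ((present_a, total_a), (present_b, total_b))
    )
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def benjamini_hochberg(p_values):
    """Adjusted p-values (a stand-in here for the q-values in the text)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical occurrence counts: (present in group A, size of A,
# present in group B, size of B) for each function.
counts = {
    "function_1": (10, 10, 0, 10),
    "function_2": (5, 10, 5, 10),
    "function_3": (8, 10, 3, 10),
}

names = list(counts)
p_values = [rao_score_two_groups(*counts[name])[1] for name in names]
q_values = benjamini_hochberg(p_values)
enriched = [name for name, q in zip(names, q_values) if q < 0.05]
```

In this sketch, a function counts as enriched when its adjusted value falls below 0.05, mirroring the threshold in the quoted methods text. A real analysis should rely on the anvi’o program itself rather than on this illustration.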

Questions on enrichment analyses? You can ask

Pangenomics, Metapangenomics

TL;DR: Cite this one if you have used anvi-pan-genome or anvi-meta-pan-genome.

The anvi’o pangenomics and metapangenomics capabilities were first introduced in this study. If you are using anvi’o to generate pangenomes and/or investigate how to bring together pangenomes and metagenomes, please consider citing this work as well.

Questions on metapangenomics? You can ask

Single-amino acid variants

If you are using anvi’o to study microbial population genetics through single-codon or single-amino acid variants, please consider also citing this work:

Single-amino acid variants reveal evolutionary processes that shape the biogeography of a global SAR11 subclade Delmont TO, Kiefl E, Kilinc O, Esen ÖC, Uysal I, Rappé MS, Giovannoni S, Eren AM Co-first authors eLife, 8:e46497 🔗

Questions on single-amino acid variants? You can ask

Snakemake workflows

There is no standalone publication that describes the anvi’o snakemake workflows, although they were first introduced in the following work:

Functional and genetic markers of niche partitioning among enigmatic members of the human oral microbiome Shaiber A, Willis AD, Delmont TO, Roux S, Chen L, Schmid AC, Yousef M, Watson AR, Lolans K, Esen ÖC, Lee STM, Downey N, Morrison HG, Dewhirst FE, Mark Welch JL, Eren AM Co-senior authors Genome Biology, 21:292 🔗

In a recent study, we cited our workflows the following way:

(…)

‘Omics workflows. Whenever applicable, we automated and scaled our ‘omics analyses using the bioinformatics workflows implemented by the program anvi-run-workflow (Shaiber et al. 2020) in anvi’o (Eren et al. 2021). Anvi’o workflows implement numerous steps of bioinformatics tasks including short-read quality filtering, assembly, gene calling, functional annotation, hidden Markov model search, metagenomic read-recruitment, metagenomic binning, pangenomics, and phylogenomics. Workflows use Snakemake (Köster and Rahmann 2012) and a tutorial is available at the URL http://merenlab.org/anvio-workflows/. The following sections detail these steps.

(…)

But please consider mentioning the specific workflow you’re using in your methods section, and giving a direct link to its help page listed here.

Questions on anvi’o snakemake workflows? You can ask

Metagenomic binning, genome refinement

TL;DR: Cite this if you have used anvi-refine.

If you used anvi’o for metagenomic binning or for the refinement of genomes, please consider citing this study, too:

Anvi’o: an advanced analysis and visualization platform for ‘omics data Eren AM, Esen ÖC, Quince C, Vineis JH, Morrison HG, Sogin ML, Delmont TO PeerJ, 3:e1319 🔗

Questions on binning practices? You can ask