metabolite-exchange-predictions


TXT

A TXT-type anvi’o artifact. This artifact is typically generated, used, and/or exported by anvi’o (and not provided by the user).


Provided by

anvi-predict-metabolic-exchanges

Required or used by

There are no anvi’o tools that use or require this artifact directly, which means it is most likely an end-product for the user.

Description

This page describes the output files produced by the program anvi-predict-metabolic-exchanges. Briefly, the most important outputs of this program are:

a file of ‘potentially-exchanged compounds’ (filename suffix: -potentially-exchanged-compounds.txt)
a file of ‘unique compounds’ (filename suffix: -unique-compounds.txt)

If you ran the prediction method that uses KEGG Pathway Map walks to examine chains of reactions leading to the production reactions (or away from the consumption reactions) of a given compound, then you will get an additional output file:

a file of evidence supporting the predictions of ‘potentially-exchanged compounds’ from Pathway Map walks

Example output files

All of the examples shown here were generated by running anvi-predict-metabolic-exchanges on the genomes of Baumannia cicadellinicola and Sulcia muelleri, which are known to exchange amino acids and vitamins as part of their co-symbiosis within the glassy-winged sharpshooter (Wu et al., 2006).

Potentially-exchanged compounds

Each line of this file describes a metabolite that is potentially exchanged between two organisms. For each prediction, it describes which genomes are involved, which organism is the predicted producer and which is the predicted consumer, and the prediction method used. If the Pathway Walk prediction method was run, then the file will also contain a basic summary of the evidence supporting each prediction.

`compound_id`	`compound_name`	`genomes`	`produced_by`	`consumed_by`	`prediction_method`	`max_reaction_chain_length`	`max_production_chain_length`	`production_overlap_length`	`production_overlap_proportion`	`production_chain_pathway_map`	`max_consumption_chain_length`	`consumption_overlap_length`	`consumption_overlap_proportion`	`consumption_chain_pathway_map`
cpd00024	2-Oxoglutarate	B_cica,S_muel	B_cica	S_muel	Pathway_Map_Walk	3	2	None	None	00340	1	1	1.0	00220
cpd00033	Glycine	B_cica,S_muel	B_cica	S_muel	Pathway_Map_Walk	2	1	None	None	00480	1	1	1.0	00630
cpd00039	L-Lysine	S_muel,B_cica	S_muel	B_cica	Pathway_Map_Walk	8	7	None	None	00300	1	1	1.0	00970

Below are the column descriptions.

Standard columns

compound_id: the ModelSEED ID of the potentially-exchanged compound
compound_name: the human-readable name of the compound
genomes: which genomes are involved in the potential exchange (labeled by the name set in the contigs-db)
produced_by: which genome(s) encode enzymes for reactions that produce the compound
consumed_by: which genome(s) encode enzymes for reactions that consume the compound
prediction_method: how did we identify this exchange? By walking over KEGG Pathway Maps or by isolating metabolites in the merged reaction network?
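As a rough illustration of how the standard columns could be read in a script, here is a minimal sketch using only the Python standard library. It assumes the file is tab-delimited with plain column names in the header (no backticks) and that missing values are written as the literal text None, as in the examples on this page. The helper names `parse_cell` and `read_exchange_table` are hypothetical and are not part of anvi’o.

```python
import csv
import io


def parse_cell(value):
    """Turn the literal text 'None' (or an empty cell) into Python's None."""
    return None if value in ("", "None") else value


def read_exchange_table(handle):
    """Read a tab-delimited output table into a list of dictionaries."""
    rows = []
    for record in csv.DictReader(handle, delimiter="\t"):
        row = {key: parse_cell(value) for key, value in record.items()}
        # 'genomes' holds a comma-separated list of genome names.
        row["genomes"] = row["genomes"].split(",") if row["genomes"] else []
        rows.append(row)
    return rows


# In-memory sample with the standard columns only, copied from the example above.
sample = (
    "compound_id\tcompound_name\tgenomes\tproduced_by\tconsumed_by\tprediction_method\n"
    "cpd00033\tGlycine\tB_cica,S_muel\tB_cica\tS_muel\tPathway_Map_Walk\n"
    "cpd00039\tL-Lysine\tS_muel,B_cica\tS_muel\tB_cica\tPathway_Map_Walk\n"
)

rows = read_exchange_table(io.StringIO(sample))
for row in rows:
    print(f"{row['compound_name']}: {row['produced_by']} -> {row['consumed_by']}")
```

To read a real output file, the same function could be given an open file handle, for instance `open(path, newline="")`, in place of the `io.StringIO` sample.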

Supporting evidence columns

These columns summarize the ‘best’ evidence from KEGG Pathway Map walks that this compound is potentially exchanged. By ‘best’, we mean that we report the longest production chain of reactions in the ‘producer’ organism and the longest consumption chain of reactions in the ‘consumer’. Longer reaction chains indicate that the compound transfer is not isolated within either genome’s reaction network; i.e., it is part of a more extensive chain of chemical transformations. When two reaction chains have the same maximum length, we report the one with the smallest (real number) proportion of overlap between the two genomes, and when those values are also equal, we report the one with the smallest (real number) overlap length. If multiple Pathway Maps have chains with exactly the same length, overlap proportion, and overlap length, then we report all of them. Note that many of these evidence columns can contain None, which means that the evidence did not apply to a particular case.
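The tie-breaking order described above (longest chain, then smallest overlap proportion, then smallest overlap length, keeping all exact ties) can be sketched as a sort key. This is only an illustration of the stated rule and not the program’s actual code. The `Chain` class, the function names, and the candidate values are made up, and ranking a None overlap after any real number is an assumption based on the ‘(real number)’ wording.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Chain:
    """One candidate reaction chain (hypothetical structure for illustration)."""
    pathway_map: str
    length: int
    overlap_proportion: Optional[float]
    overlap_length: Optional[int]


def rank(chain: Chain) -> Tuple[float, float, float]:
    """Smaller tuples are better: longest chain first, then least overlap."""
    # ASSUMPTION: a missing (None) overlap ranks after any real number.
    proportion = float("inf") if chain.overlap_proportion is None else chain.overlap_proportion
    overlap = float("inf") if chain.overlap_length is None else chain.overlap_length
    return (-chain.length, proportion, overlap)


def best_chains(chains: List[Chain]) -> List[Chain]:
    """Return every chain tied for the best rank (possibly from several maps)."""
    if not chains:
        return []
    best = min(rank(chain) for chain in chains)
    return [chain for chain in chains if rank(chain) == best]


# Made-up candidates, not taken from real output.
candidates = [
    Chain("00300", 7, 0.5, 3),
    Chain("00310", 7, 0.25, 2),
    Chain("00970", 4, 0.0, 0),
]
print([chain.pathway_map for chain in best_chains(candidates)])  # ['00310']
```

Returning a list rather than a single chain mirrors the statement that all Pathway Maps with identical length, overlap proportion, and overlap length are reported together.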

max_reaction_chain_length: The total length of the longest production and consumption reaction chains (max_production_chain_length + max_consumption_chain_length)
max_production_chain_length: The length of the longest production chain in the ‘producer’ organism
production_overlap_length: If the ‘consumer’ organism has enzymes that belong to the Pathway Map containing the production chain, then how long is the overlap between the ‘consumer’ organism’s production chains for the compound and the producer’s production chain? If this is None, it means the ‘consumer’ didn’t have any production chains for the compound and we couldn’t compute an overlap.
production_overlap_proportion: If there was overlap between the production chains in the producer and consumer, what proportion of reactions were found in both organisms?
production_chain_pathway_map: The ID(s) of the KEGG Pathway Map in which we found the “best” production chain. If there are multiple Pathway Maps in a comma-separated list here, it means they all included a production chain with the same (maximum) length, (minimum) overlap proportion, and (minimum) overlap length.
max_consumption_chain_length: The length of the longest consumption chain in the ‘consumer’ organism
consumption_overlap_length: If the ‘producer’ organism has enzymes that belong to the Pathway Map containing the consumption chain, then how long is the overlap between the ‘producer’ organism’s consumption chains for the compound and the consumer’s consumption chain? If this is None, it means the ‘producer’ didn’t have any consumption chains for the compound and we couldn’t compute an overlap.
consumption_overlap_proportion: If there was overlap between the consumption chains in the producer and consumer, what proportion of reactions were found in both organisms?
consumption_chain_pathway_map: The ID(s) of the KEGG Pathway Map in which we found the “best” consumption chain. If there are multiple Pathway Maps in a comma-separated list here, it means they all included a consumption chain with the same (maximum) length, (minimum) overlap proportion, and (minimum) overlap length.

If you are curious about the other Pathway Walk evidence for a given prediction, take a look at the ‘Evidence’ file (described in the next section).

Additional columns

Some columns can be added by using particular flags.

equivalent_compound_id: Shows the ModelSEED ID of any compound deemed equivalent to the reported compound_id. Added by running the program with either --use-equivalent-amino-acids or --custom-equivalent-compounds-file.
production_rxn_ids_* and consumption_rxn_ids_*: The ModelSEED ID of any production/consumption reactions the compound participates in, specific to the reaction network of a given genome. Added by using the --add-reactions-to-output flag.
production_rxn_eqs_* and consumption_rxn_eqs_*: The chemical reaction equation of any production/consumption reactions the compound participates in, specific to the reaction network of a given genome. Added by using the --add-reactions-to-output flag.
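Because the reaction columns end in a wildcard, a script cannot know their exact names in advance. One way to find them is to match on the documented prefixes, as in the sketch below. The example header uses genome names as the suffix, which is an assumption drawn from the phrase ‘specific to the reaction network of a given genome’. The function name `reaction_columns` is hypothetical.

```python
PREFIXES = (
    "production_rxn_ids_",
    "consumption_rxn_ids_",
    "production_rxn_eqs_",
    "consumption_rxn_eqs_",
)


def reaction_columns(fieldnames):
    """Group header names by which documented prefix they start with."""
    grouped = {prefix: [] for prefix in PREFIXES}
    for name in fieldnames:
        for prefix in PREFIXES:
            if name.startswith(prefix):
                grouped[prefix].append(name)
    return grouped


# Hypothetical header; the suffixes after each prefix are assumed, not confirmed.
header = [
    "compound_id",
    "compound_name",
    "production_rxn_ids_B_cica",
    "consumption_rxn_ids_S_muel",
]
for prefix, names in reaction_columns(header).items():
    print(prefix, names)
```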

Pathway Walk evidence for potentially-exchanged compounds

Each line of this file describes the reaction chain evidence for a compound from one KEGG Pathway Map, in one organism. In each case, we report the longest reaction chain that we could find (rather than reporting all of the many possible chains). If a compound is present in multiple Pathway Maps, each map gets its own line.

`compound`	`compound_name`	`longest_chain_compound_names`	`longest_chain_compounds`	`longest_chain_reactions`	`longest_reaction_chain_length`	`maximum_overlap`	`organism`	`pathway_map`	`proportion_overlap`	`type`
cpd00024	2-Oxoglutarate				None	None	B_cica	00350	None	production
cpd00024	2-Oxoglutarate				None	None	S_muel	00350	None	consumption
cpd00065	L-Tryptophan	L-Tryptophan,L-Tryptophanyl-tRNA(Trp)	C00078,C03512	rn:R03664	1	1	B_cica	00970	1.0	consumption
cpd00383	Shikimate				4	None	B_cica	00400	None	production
cpd00383	Shikimate	Shikimate,3-phosphoshikimate,5-O-(1-Carboxyvinyl)-3-phosphoshikimate,Chorismate,Prephenate,Phenylpyruvate,L-Phenylalanine	C00493,C03175,C01269,C00251,C00254,C00166,C00079	rn:R02412,rn:R03460,rn:R01714,rn:R01715,rn:R01373,rn:R00694	6	3	S_muel	00400	0.5	consumption
cpd03607	4-(Phosphonooxy)-threonine				2	None	B_cica	00750	None	production
cpd03607	4-(Phosphonooxy)-threonine	4-(Phosphonooxy)-threonine,4-Hydroxy-L-threonine,Pyridoxol,Glycolaldehyde	C06055,C06056,C00314,C00266	rn:R05086,rn:R01913,rn:R05840	3	0	S_muel	00750	None	consumption

compound: the ModelSEED ID of the potentially-exchanged compound
compound_name: the human-readable name of the compound
longest_chain_compound_names: a comma-separated list of all (human-readable) compound names in this reaction chain
longest_chain_compounds: a comma-separated list of all compound IDs in this reaction chain (KEGG compound IDs in the example above, e.g. C00078)
longest_chain_reactions: a comma-separated list of all reaction IDs in this reaction chain (KEGG reaction IDs in the example above, e.g. rn:R03664)
longest_reaction_chain_length: the length of (number of reactions in) this reaction chain
maximum_overlap: the largest number of overlapping reactions between this reaction chain and similar chains in the other organism
organism: in which genome was this reaction chain found?
pathway_map: in which KEGG Pathway Map was this reaction chain found?
proportion_overlap: the maximum proportion of reactions that overlap between this reaction chain and similar chains in the other organism
type: whether this is a ‘production’ chain or a ‘consumption’ chain for the compound
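To make a chain easier to read, the compound and reaction lists can be interleaved into a single line. The sketch below assumes that a chain of n reactions links n + 1 compounds, which holds for the populated example rows above but is not stated as a guarantee. It uses the ID columns rather than the name column, since compound names may themselves contain commas. The function name `render_chain` is hypothetical.

```python
def render_chain(compounds_field, reactions_field):
    """Interleave compound IDs and reaction IDs into one readable string."""
    compounds = compounds_field.split(",") if compounds_field else []
    reactions = reactions_field.split(",") if reactions_field else []
    if not compounds:
        # Some evidence rows leave the chain columns empty.
        return ""
    # ASSUMPTION: n reactions connect n + 1 compounds.
    if len(compounds) != len(reactions) + 1:
        raise ValueError("expected one more compound than reactions")
    parts = [compounds[0]]
    for reaction, compound in zip(reactions, compounds[1:]):
        parts.append(f"--[{reaction}]-->")
        parts.append(compound)
    return " ".join(parts)


# Values copied from the cpd03607 consumption row in the example above.
print(render_chain(
    "C06055,C06056,C00314,C00266",
    "rn:R05086,rn:R01913,rn:R05840",
))
```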

Unique compounds

Each line of this file describes a metabolite that is found in only one of the organisms’ reaction networks.

`compound_id`	`compound_name`	`genomes`	`produced_by`	`consumed_by`	`prediction_method`
cpd00002	ATP	B_cica	B_cica	B_cica	Pathway_Map_Walk
cpd00003	NAD	B_cica	B_cica	B_cica	Pathway_Map_Walk
cpd00005	NADPH	B_cica	B_cica	None	Pathway_Map_Walk

It uses the same standard columns and additional columns as the output for potentially-exchanged compounds.

Compounds with no prediction

This is an optional output file that is only generated when using the flag --report-compounds-with-no-prediction. Each line of this file describes a metabolite that was not predicted to be either potentially-exchanged or unique between the input pair of genomes. Rather, these are compounds that are produced by both organisms and consumed by neither, consumed by both organisms and produced by neither, or produced and consumed by both organisms.

`compound_id`	`compound_name`	`genomes`	`produced_by`	`consumed_by`	`prediction_method`
cpd00227	L-Homoserine	B_cica,S_muel	B_cica,S_muel	B_cica,S_muel	Pathway_Map_Walk
cpd19009	alpha-D-Mannose	None	None	None	Pathway_Map_Walk
cpd29753	C15811	B_cica,S_muel	None	B_cica,S_muel	Reaction_Network_Subset
cpd00190	beta-D-Glucose	B_cica,S_muel	B_cica,S_muel	None	Reaction_Network_Subset

It uses the same standard columns and some additional columns as the output for potentially-exchanged compounds.
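The three situations listed for this file can be told apart from the produced_by and consumed_by columns alone. The following sketch labels a row accordingly. It is an interpretation of the description on this page and not the program’s own logic, and the function names are hypothetical. Rows that match none of the three cases (such as the example row where both columns are None) fall through to ‘other’.

```python
def split_genomes(cell):
    """Parse a comma-separated genome cell; None or 'None' means no genomes."""
    return set() if cell in (None, "", "None") else set(cell.split(","))


def no_prediction_reason(produced_by, consumed_by, pair):
    """Label a no-prediction row using the three cases described above."""
    producers = split_genomes(produced_by)
    consumers = split_genomes(consumed_by)
    both = set(pair)
    if producers == both and consumers == both:
        return "produced and consumed by both"
    if producers == both and not consumers:
        return "produced by both, consumed by neither"
    if consumers == both and not producers:
        return "consumed by both, produced by neither"
    return "other"


# Values copied from the example rows above.
pair = ("B_cica", "S_muel")
print(no_prediction_reason("B_cica,S_muel", "B_cica,S_muel", pair))
print(no_prediction_reason("None", "B_cica,S_muel", pair))
print(no_prediction_reason("B_cica,S_muel", "None", pair))
print(no_prediction_reason("None", "None", pair))
```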
