metabolite-exchange-predictions

TXT

A TXT-type anvi’o artifact. This artifact is typically generated, used, and/or exported by anvi’o (and not provided by the user).


Provided by

anvi-predict-metabolic-exchanges

Required or used by

There are no anvi’o tools that use or require this artifact directly, which means it is most likely an end-product for the user.

Description

This page describes the output files produced by the program anvi-predict-metabolic-exchanges. Briefly, the most important outputs of this program are:

  • a file of ‘potentially-exchanged compounds’ (filename suffix: -potentially-exchanged-compounds.txt)
  • a file of ‘unique compounds’ (filename suffix: -unique-compounds.txt)

If you ran the prediction method that uses KEGG Pathway Map walks to examine chains of reactions leading to the production reactions (or away from the consumption reactions) of a given compound, then you will get an additional output file:

  • a file of evidence supporting the predictions of ‘potentially-exchanged compounds’ from Pathway Map walks

Example output files

All of the examples shown here were generated by running anvi-predict-metabolic-exchanges on the genomes of Baumannia cicadellinicola and Sulcia muelleri, which are known to exchange amino acids and vitamins as part of their co-symbiosis within the glassy-winged sharpshooter (Wu et al., 2006).

Potentially-exchanged compounds

Each line of this file describes a metabolite that is potentially exchanged between two organisms. For each prediction, it reports which genomes are involved, which organism is the predicted producer, which is the predicted consumer, and which prediction method was used. If the Pathway Walk prediction method was run, then the file will also contain a basic summary of the evidence supporting each prediction.

compound_id compound_name genomes produced_by consumed_by prediction_method max_reaction_chain_length max_production_chain_length production_overlap_length production_overlap_proportion production_chain_pathway_map max_consumption_chain_length consumption_overlap_length consumption_overlap_proportion consumption_chain_pathway_map
cpd00024 2-Oxoglutarate B_cica,S_muel B_cica S_muel Pathway_Map_Walk 3 2 None None 00340 1 1 1.0 00220
cpd00033 Glycine B_cica,S_muel B_cica S_muel Pathway_Map_Walk 2 1 None None 00480 1 1 1.0 00630
cpd00039 L-Lysine S_muel,B_cica S_muel B_cica Pathway_Map_Walk 8 7 None None 00300 1 1 1.0 00970

Below are the column descriptions.

Standard columns

  • compound_id: the ModelSEED ID of the potentially-exchanged compound
  • compound_name: the human-readable name of the compound
  • genomes: which genomes are involved in the potential exchange (labeled by the name set in the contigs-db)
  • produced_by: which genome(s) encode enzymes for reactions that produce the compound
  • consumed_by: which genome(s) encode enzymes for reactions that consume the compound
  • prediction_method: how did we identify this exchange? By walking over KEGG Pathway Maps or by isolating metabolites in the merged reaction network?
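
As a rough illustration of how the standard columns can be used, the sketch below loads a few rows and lists the compounds produced by a given genome. It assumes the file is tab-delimited with a header row, which is not stated on this page; the helper names load_exchanges and produced_by are hypothetical, and the sample rows are copied from the example above (standard columns only).

```python
import csv
import io

# Assumption: the output is tab-delimited text with a header row.
# Rows copied from the example above, restricted to the standard columns.
SAMPLE = (
    "compound_id\tcompound_name\tgenomes\tproduced_by\tconsumed_by\tprediction_method\n"
    "cpd00024\t2-Oxoglutarate\tB_cica,S_muel\tB_cica\tS_muel\tPathway_Map_Walk\n"
    "cpd00033\tGlycine\tB_cica,S_muel\tB_cica\tS_muel\tPathway_Map_Walk\n"
    "cpd00039\tL-Lysine\tS_muel,B_cica\tS_muel\tB_cica\tPathway_Map_Walk\n"
)


def load_exchanges(handle):
    """Return one dict per row, with the 'genomes' cell split into a list."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["genomes"] = row["genomes"].split(",")
        rows.append(row)
    return rows


def produced_by(rows, genome):
    """Names of compounds whose predicted producer(s) include `genome`."""
    return [r["compound_name"] for r in rows
            if genome in r["produced_by"].split(",")]


rows = load_exchanges(io.StringIO(SAMPLE))
print(produced_by(rows, "B_cica"))
print(produced_by(rows, "S_muel"))
```

To read a real output file, the same function could be given an open file handle instead of the in-memory sample.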

Supporting evidence columns

These columns summarize the ‘best’ evidence from KEGG Pathway Map walks that this compound is potentially exchanged. By ‘best’, we mean that we report the longest production chain of reactions in the ‘producer’ organism and the longest consumption chain of reactions in the ‘consumer’. Longer reaction chains indicate that the compound transfer is not isolated within either genome’s reaction network; i.e., it is part of a more extensive chain of chemical transformations. When two reaction chains have the same maximum length, we report the one with the smallest (real number) proportion of overlap between the two genomes, and when those values are also the same, we report the one with the smallest (real number) overlap length. If multiple Pathway Maps have chains of exactly the same length, overlap proportion, and overlap length, then we report all of them. Note that many of these evidence columns can contain None values, which means that the evidence did not apply to a particular case.

  • max_reaction_chain_length: The total length of the longest production and consumption reaction chains (max_production_chain_length + max_consumption_chain_length)
  • max_production_chain_length: The length of the longest production chain in the ‘producer’ organism
  • production_overlap_length: If the ‘consumer’ organism has enzymes that belong to the Pathway Map containing the production chain, then how long is the overlap between the ‘consumer’ organism’s production chains for the compound and the producer’s production chain? If this is None, it means the ‘consumer’ didn’t have any production chains for the compound and we couldn’t compute an overlap.
  • production_overlap_proportion: If there was overlap between the production chains in the producer and consumer, what proportion of reactions were found in both organisms?
  • production_chain_pathway_map: The ID(s) of the KEGG Pathway Map in which we found the “best” production chain. If there are multiple Pathway Maps in a comma-separated list here, it means they all included a production chain with the same (maximum) length, (minimum) overlap proportion, and (minimum) overlap length.
  • max_consumption_chain_length: The length of the longest consumption chain in the ‘consumer’ organism
  • consumption_overlap_length: If the ‘producer’ organism has enzymes that belong to the Pathway Map containing the consumption chain, then how long is the overlap between the ‘producer’ organism’s consumption chains for the compound and the consumer’s consumption chain? If this is None, it means the ‘producer’ didn’t have any consumption chains for the compound and we couldn’t compute an overlap.
  • consumption_overlap_proportion: If there was overlap between the consumption chains in the producer and consumer, what proportion of reactions were found in both organisms?
  • consumption_chain_pathway_map: The ID(s) of the KEGG Pathway Map in which we found the “best” consumption chain. If there are multiple Pathway Maps in a comma-separated list here, it means they all included a consumption chain with the same (maximum) length, (minimum) overlap proportion, and (minimum) overlap length.

If you are curious about the other Pathway Walk evidence for a given prediction, take a look at the ‘Evidence’ file (described in the next section).
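
The selection rule for the ‘best’ chain described in this section (longest chain first, then smallest overlap proportion, then smallest overlap length, keeping every remaining tie) can be sketched as below. This is only an illustration under stated assumptions, not the program’s actual implementation: the Chain type and best_chains function are hypothetical, the candidate chains are made-up values, and treating None as 0 when ordering is a choice made for this sketch only.

```python
from typing import NamedTuple, Optional


class Chain(NamedTuple):
    """Hypothetical record for one candidate reaction chain."""
    pathway_map: str
    length: int
    overlap_proportion: Optional[float]
    overlap_length: Optional[int]


def best_chains(chains):
    """Keep the longest chain(s); break ties by smallest overlap proportion,
    then by smallest overlap length; return every chain that is still tied."""
    def key(c):
        # Assumption of this sketch: None (no overlap computed) orders as 0.
        prop = c.overlap_proportion if c.overlap_proportion is not None else 0.0
        olen = c.overlap_length if c.overlap_length is not None else 0
        return (-c.length, prop, olen)

    if not chains:
        return []
    top = min(key(c) for c in chains)
    return [c for c in chains if key(c) == top]


# Made-up candidates for illustration; not taken from real output.
candidates = [
    Chain("00300", 8, 0.5, 4),
    Chain("00310", 8, 0.25, 2),
    Chain("00260", 8, 0.25, 2),
    Chain("00970", 4, None, None),
]
best = best_chains(candidates)
print(",".join(c.pathway_map for c in best))
```

Here two maps remain tied on all three criteria, so both would be listed, which mirrors the comma-separated Pathway Map IDs described for the *_chain_pathway_map columns. In the example rows above, max_reaction_chain_length is also the sum of the two chain lengths (3 = 2 + 1, 2 = 1 + 1, 8 = 7 + 1).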

Additional columns

Some columns can be added by using particular flags.

  • equivalent_compound_id: Shows the ModelSEED ID of any compound deemed equivalent to the reported compound_id. Added by running the program with either --use-equivalent-amino-acids or --custom-equivalent-compounds-file.
  • production_rxn_ids_* and consumption_rxn_ids_*: The ModelSEED ID of any production/consumption reactions the compound participates in, specific to the reaction network of a given genome. Added by using the --add-reactions-to-output flag.
  • production_rxn_eqs_* and consumption_rxn_eqs_*: The chemical reaction equation of any production/consumption reactions the compound participates in, specific to the reaction network of a given genome. Added by using the --add-reactions-to-output flag.

Pathway Walk evidence for potentially-exchanged compounds

Each line of this file describes the reaction chain evidence for a compound from one KEGG Pathway Map, in one organism. In each case, we report the longest reaction chain that we could find (rather than reporting all of the many possible chains). If a compound is present in multiple Pathway Maps, each map gets its own line.

compound	compound_name	longest_chain_compound_names	longest_chain_compounds	longest_chain_reactions	longest_reaction_chain_length	maximum_overlap	organism	pathway_map	proportion_overlap	type
cpd00024	2-Oxoglutarate				None	None	B_cica	00350	None	production
cpd00024	2-Oxoglutarate				None	None	S_muel	00350	None	consumption
cpd00065	L-Tryptophan	L-Tryptophan,L-Tryptophanyl-tRNA(Trp)	C00078,C03512	rn:R03664	1	1	B_cica	00970	1.0	consumption
cpd00383	Shikimate				4	None	B_cica	00400	None	production
cpd00383	Shikimate	Shikimate,3-phosphoshikimate,5-O–1-Carboxyvinyl-3-phosphoshikimate,Chorismate,Prephenate,Phenylpyruvate,L-Phenylalanine	C00493,C03175,C01269,C00251,C00254,C00166,C00079	rn:R02412,rn:R03460,rn:R01714,rn:R01715,rn:R01373,rn:R00694	6	3	S_muel	00400	0.5	consumption
cpd03607	4-(Phosphonooxy)-threonine				2	None	B_cica	00750	None	production
cpd03607	4-(Phosphonooxy)-threonine	4-(Phosphonooxy)-threonine,4-Hydroxy-L-threonine,Pyridoxol,Glycolaldehyde	C06055,C06056,C00314,C00266	rn:R05086,rn:R01913,rn:R05840	3	0	S_muel	00750	None	consumption

  • compound: the ModelSEED ID of the potentially-exchanged compound
  • compound_name: the human-readable name of the compound
  • longest_chain_compound_names: a comma-separated list of all (human-readable) compound names in this reaction chain
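  • longest_chain_compounds: a comma-separated list of the IDs of all compounds in this reaction chain (in the example rows these are KEGG compound IDs, such as C00078)
  • longest_chain_reactions: a comma-separated list of the IDs of all reactions in this reaction chain (in the example rows these are KEGG reaction IDs, such as rn:R03664)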
  • longest_reaction_chain_length: the length of (number of reactions in) this reaction chain
  • maximum_overlap: the largest number of overlapping reactions between this reaction chain and similar chains in the other organism
  • organism: in which genome was this reaction chain found?
  • pathway_map: in which KEGG Pathway Map was this reaction chain found?
  • proportion_overlap: the maximum proportion of reactions that overlap between this reaction chain and similar chains in the other organism
  • type: whether this is a ‘production’ chain or a ‘consumption’ chain for the compound
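
The list-valued columns can be split and compared with the reported chain length. In the populated example rows, each chain lists one more compound than reactions, the number of reactions equals longest_reaction_chain_length, and proportion_overlap equals maximum_overlap divided by that length (3 / 6 = 0.5 for Shikimate in S_muel). Whether these relationships hold for every row is an assumption of this sketch, and the helper names are hypothetical.

```python
def parse_list(cell):
    """Split a comma-separated cell; empty or 'None' cells give an empty list."""
    cell = cell.strip()
    return [] if cell in ("", "None") else cell.split(",")


def parse_int(cell):
    """Convert a cell to int, mapping empty or 'None' cells to None."""
    cell = cell.strip()
    return None if cell in ("", "None") else int(cell)


def check_row(row):
    """Compare list lengths and overlap with the reported chain length."""
    compounds = parse_list(row["longest_chain_compounds"])
    reactions = parse_list(row["longest_chain_reactions"])
    length = parse_int(row["longest_reaction_chain_length"])
    overlap = parse_int(row["maximum_overlap"])
    return {
        "n_compounds": len(compounds),
        "n_reactions": len(reactions),
        "reactions_match_length": length is not None and len(reactions) == length,
        "compounds_match_length": length is not None and len(compounds) == length + 1,
        # Assumed relationship: proportion = overlap / chain length.
        "overlap_ratio": overlap / length if overlap and length else None,
    }


# Values copied from the S_muel Shikimate consumption row in the example above.
shikimate = {
    "longest_chain_compounds": "C00493,C03175,C01269,C00251,C00254,C00166,C00079",
    "longest_chain_reactions": "rn:R02412,rn:R03460,rn:R01714,rn:R01715,rn:R01373,rn:R00694",
    "longest_reaction_chain_length": "6",
    "maximum_overlap": "3",
    "proportion_overlap": "0.5",
}
result = check_row(shikimate)
print(result)
```

The compound name column is deliberately not split here, since compound names may themselves contain commas.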

Unique compounds

Each line of this file describes a metabolite that is found in only one of the organisms’ reaction networks.

compound_id compound_name genomes produced_by consumed_by prediction_method
cpd00002 ATP B_cica B_cica B_cica Pathway_Map_Walk
cpd00003 NAD B_cica B_cica B_cica Pathway_Map_Walk
cpd00005 NADPH B_cica B_cica None Pathway_Map_Walk

It uses the same standard columns and additional columns as the output for potentially-exchanged compounds.

Compounds with no prediction

This is an optional output file that is only generated when using the flag --report-compounds-with-no-prediction. Each line of this file describes a metabolite that was not predicted to be either potentially-exchanged or unique between the input pair of genomes. Rather, these are compounds that are produced by both organisms and consumed by neither, consumed by both organisms and produced by neither, or produced and consumed by both organisms.

compound_id compound_name genomes produced_by consumed_by prediction_method
cpd00227 L-Homoserine B_cica,S_muel B_cica,S_muel B_cica,S_muel Pathway_Map_Walk
cpd19009 alpha-D-Mannose None None None Pathway_Map_Walk
cpd29753 C15811 B_cica,S_muel None B_cica,S_muel Reaction_Network_Subset
cpd00190 beta-D-Glucose B_cica,S_muel B_cica,S_muel None Reaction_Network_Subset

It uses the same standard columns and some additional columns as the output for potentially-exchanged compounds.
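
The three cases named in the description of this file can be told apart from the produced_by and consumed_by columns alone. The sketch below labels the four example rows; the describe function is hypothetical, and the ‘other’ label is simply a catch-all for rows (such as alpha-D-Mannose above) that match none of the three described cases.

```python
def parse_genomes(cell):
    """Turn a cell such as 'B_cica,S_muel' or 'None' into a set of genome names."""
    cell = cell.strip()
    return set() if cell in ("", "None") else set(cell.split(","))


def describe(produced_by, consumed_by, pair):
    """Label a row with one of the three described cases, else 'other'."""
    producers = parse_genomes(produced_by)
    consumers = parse_genomes(consumed_by)
    both = set(pair)
    if producers == both and consumers == both:
        return "produced and consumed by both"
    if producers == both and not consumers:
        return "produced by both, consumed by neither"
    if consumers == both and not producers:
        return "consumed by both, produced by neither"
    return "other"


pair = ("B_cica", "S_muel")
# produced_by / consumed_by values copied from the example rows above.
examples = {
    "L-Homoserine": ("B_cica,S_muel", "B_cica,S_muel"),
    "alpha-D-Mannose": ("None", "None"),
    "C15811": ("None", "B_cica,S_muel"),
    "beta-D-Glucose": ("B_cica,S_muel", "None"),
}
labels = {name: describe(p, c, pair) for name, (p, c) in examples.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```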
