A TXT-type anvi’o artifact. This artifact is typically generated, used, and/or exported by anvi’o (and not provided by the user).
There are no anvi’o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
A TAB-delimited file of palindromic sequences reported by anvi-search-palindromes.
The following example is the output of the command below, run on the contigs-db of the Infant Gut Dataset:
anvi-search-palindromes -c CONTIGS.db \
                        --min-palindrome-length 50 \
                        --max-num-mismatches 1 \
                        --output-file palindromes.txt
sequence_name | length | distance | num_mismatches | first_start | first_end | first_sequence | second_start | second_end | second_sequence | midline |
---|---|---|---|---|---|---|---|---|---|---|
Day17a_QCcontig1 | 48 | 0 | 0 | 195100 | 195148 | AAGAGAAGAGGAGAAGTTCATCCATGGATGAACTTCTCCTCTTCTCTT | 195100 | 195148 | AAGAGAAGAGGAGAAGTTCATCCATGGATGAACTTCTCCTCTTCTCTT | |||||||||||||||||||||||||||||||||||||||||||||||| |
Day17a_QCcontig4 | 147 | 759 | 1 | 268872 | 269019 | TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAAGCTAGAAAAA | 269631 | 269778 | TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAAACTAGAAAAA | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||x||||||||| |
Day17a_QCcontig4 | 53 | 1956 | 1 | 268237 | 268290 | CAGCTGCTTTTGTCAAAAGCACATAGGAATTTCACCTCTCCCCAAGTTTACGG | 270193 | 270246 | CAGCTGCTTTTGTCAAAAGCACATAGGAATTTCACCTCTCTCCAAGTTTACGG | ||||||||||||||||||||||||||||||||||||||||x|||||||||||| |
Day17a_QCcontig4 | 66 | 1956 | 1 | 268325 | 268391 | ATCATCACTTTTTATTGACTATAAAAATTATTTTAGAATATTTATCGCTCCTTCTTTACGATAAGA | 270281 | 270347 | ATCATCACTTTTTATTGACTATAAAAATTATTTTAGAATGTTTATCGCTCCTTCTTTACGATAAGA | |||||||||||||||||||||||||||||||||||||||x|||||||||||||||||||||||||| |
Day17a_QCcontig4 | 60 | 98694 | 1 | 16368 | 16428 | AGAACAATTTTCGGAAATTCCTTCTTATTTCTCGGAGTTAAACGCTTCTGTCCCGACCTC | 115062 | 115122 | AGAACAATTTTCGGAAATTCCTTCTTATTTCTCGGAGTTAAACACTTCTGTCCCGACCTC | |||||||||||||||||||||||||||||||||||||||||||x|||||||||||||||| |
Day17a_QCcontig16 | 42 | 0 | 0 | 105735 | 105777 | AAAAAGAACGCTCTTTTGCTTAAGCAAAAGAGCGTTCTTTTT | 105735 | 105777 | AAAAAGAACGCTCTTTTGCTTAAGCAAAAGAGCGTTCTTTTT | |||||||||||||||||||||||||||||||||||||||||| |
Day17a_QCcontig23 | 50 | 0 | 0 | 51287 | 51337 | ATAAATAAACAGAGGCCTTAGAAATATTTCTAAGGCCTCTGTTTATTTAT | 51287 | 51337 | ATAAATAAACAGAGGCCTTAGAAATATTTCTAAGGCCTCTGTTTATTTAT | |||||||||||||||||||||||||||||||||||||||||||||||||| |
In which,
sequence_name is the name of the sequence on which a given palindrome was found.
length is the length of the palindrome.
distance is the number of nucleotides between the locations of the two palindromic sequences in the larger sequence.
num_mismatches is the number of nucleotides in the palindromic sequence that did not match their counterparts when the sequence was reverse-complemented.
first_start is the start position of the first palindrome in the reference sequence.
first_end is the end position of the first palindrome.
second_start and second_end are just like first_start and first_end, but for the second sequence. For perfect palindromes (i.e., palindromes with zero distance), these values will be identical to their counterparts in the first sequence.
first_sequence and second_sequence are the actual nucleotide sequences of both. They will be identical if the number of mismatches is zero. Please note that only the reverse complement of second_sequence will be found in the reference sequence.
midline is a string of | and x characters that shows where the nucleotides match and where they mismatch (if any).
Please note that the sequence_name column may not contain unique sequence names if multiple palindromes are found on the same sequence (which will almost certainly be the case for most searches on circular genomes).
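The column definitions above can be exercised with a short Python sketch. It assumes the file is TAB-delimited with the header shown in the example table. The helper names (`read_palindromes`, `build_midline`, `check_row`) are hypothetical and not part of anvi’o, and the midline check assumes that first_sequence and second_sequence are compared position by position, as the example rows suggest.

```python
import csv

# Columns that hold integers, per the header shown in the example table.
INT_COLUMNS = ("length", "distance", "num_mismatches",
               "first_start", "first_end", "second_start", "second_end")


def read_palindromes(path):
    """Read a palindromes-txt file into a list of dicts (hypothetical helper)."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        rows = list(reader)
    for row in rows:
        for column in INT_COLUMNS:
            row[column] = int(row[column])
    return rows


def build_midline(first_sequence, second_sequence):
    """Return '|' for each matching position and 'x' for each mismatch."""
    return "".join("|" if a == b else "x"
                   for a, b in zip(first_sequence, second_sequence))


def check_row(row):
    """Return a list of inconsistencies found in one row (empty if none)."""
    problems = []
    # Assumption from the example rows: first_end - first_start equals length.
    if row["first_end"] - row["first_start"] != row["length"]:
        problems.append("length does not match first_start/first_end")
    midline = build_midline(row["first_sequence"], row["second_sequence"])
    if midline != row["midline"]:
        problems.append("midline does not match the two sequences")
    if midline.count("x") != row["num_mismatches"]:
        problems.append("num_mismatches does not match the midline")
    return problems
```

Since sequence_name is not guaranteed to be unique, rows returned by `read_palindromes` are kept as a list rather than keyed by sequence name.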
Please also note that the start and end positions are 0-indexed, which means (1) the first nucleotide in the sequence should be counted as the zeroth element, and (2) the end position is excluded, so slicing the larger sequence in Python with the positions from the example above gives you the matching palindrome (assuming contig_sequences is a dictionary that maps contig names to their sequences):
>>> contig_sequences['Day17a_QCcontig1'][195100:195148]
'AAGAGAAGAGGAGAAGTTCATCCATGGATGAACTTCTCCTCTTCTCTT'
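The relationship between the reported sequences and the reference can be sketched in Python as well. This is a minimal illustration, not anvi’o code: `reverse_complement`, `palindrome_slices`, and `second_matches_reference` are hypothetical helpers, `contig_sequences` is assumed to be a dictionary of contig names and sequences, and `row` is assumed to be a dictionary of one output line with the position columns already converted to integers. The last helper relies on the note that only the reverse complement of second_sequence is found in the reference.

```python
# Translation table for complementing nucleotides (N is left as N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Return the reverse complement of a nucleotide sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def palindrome_slices(contig_sequences, row):
    """Slice both palindrome regions out of the reference (0-indexed, end excluded)."""
    contig = contig_sequences[row["sequence_name"]]
    first = contig[row["first_start"]:row["first_end"]]
    second = contig[row["second_start"]:row["second_end"]]
    return first, second


def second_matches_reference(contig_sequences, row):
    """Check that the reference holds the reverse complement of second_sequence."""
    _, second = palindrome_slices(contig_sequences, row)
    return second == reverse_complement(row["second_sequence"])


# The perfect palindrome from the first example row is its own reverse complement.
palindrome = "AAGAGAAGAGGAGAAGTTCATCCATGGATGAACTTCTCCTCTTCTCTT"
is_perfect = reverse_complement(palindrome) == palindrome
```

For a perfect palindrome the two slices cover the same region, so both checks reduce to comparing the sequence with its own reverse complement.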