Extraction of upstream and downstream regions from long sequence file


 
08-06-2015

Hello, I am posting my query again, this time with modified input files.
My query is as follows:

I have two input files, file1 and file2.

file1 is smalldata.fasta:

Code:
>gi|546671471|gb|AWWX01449637.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig449636, whole genome shotgun sequence
TATGGATGTGAGAGTTGGACTATGAAGAAGGCTGAGCGCTGAAGAATTGATGCTTTTGAACTGTGGTGTT
GGAGAAGACTCTTGAGAGTCCCTTGGACTGCAAGGAGATCCAACCAGTCCATTCTGAAGGAGATCAACCC
TGGGATTTCTTTGGAAGGAATGATGCTAAAGCTGAAACTCCAGTACTTTGGCCACCTCATGCAAAGAGTT
GACTCATTGGAAAAGACTCTGATGCTGGGAGGGATTGGGGGCAGGAGGAGAAGGGGACGACAGAGGATGA
GATGGCTGGATGGCATCACTGACTCGATGGACGTGAGTCTGAGTGAACTCCGGGAGTTGGTGATGGACAG
GGAGGCCTGGTGTGCTGCGATTCATGGGGTCGCAAAGAGTCGGACACGACTGAGTGACTGATCTGATCTG
ATGTGATCTGAGGGAGGCCTGGTGTACTGCAATTCATGGGGTCGCAAAGAGTTGGACACGACTGAGTGAC
TGATCTGATCTGATCTGATTACTCATTTGATTTTCCAGTTTTAAATGTCATTCATTGTATCTTCACTAGA
AAAGGTGATTTCACTCTTTCCCATTATACAGAAACATATTTCCTATCTCTTCAAATATAGTTACACTATT
TTATTTTAATTTGATTTGACTGTCTATTGTCTTTGAGGAGTGGGGTTGTACTGGGTCTTGGTTACAGGAT
CTTTAGTTGCAGCATGTGGGATCTAGATCCCTGTCCAGGGCCCTGAGTATGGGGAGCTCAGAGTCTTAGC
CACAAGACCACCAGGGAAGTTTCCAGTTACACGATCATTTTAGTTAGATAAATATTTTGTGTTTACATTA
TTACTGTATCAGTGATATTCACACTGAATTATACAATGTGATTTTTACACAGTAATTTTTTCTTTCTGGC
TTATTTTTGCGCTTTTCCTGAAATTCATCGTTGTCCTTGTTTTGTGTAGGTTTCTAAGAACTCAGCCCTA
GTTAAACTCCAGACTTCGTGTCAAGTGTATAAATCTCCATTCAAGATGTTCAGAAGCCTGTGGTGACCTA
CGAATTCTGTCTTTCTGGGAAGTCCCTGCTCCCAGCTGGACGATCCCCCGGAGTGCACGCCCATCACTGC
AGGATTCCTGACACGTGTCGCCTTGGTTAGTGTCCTGCTGTCTTCCTCTTTCTTGTTTCATCCTGTAGTT
TGGGGGACCACAGCCCCTGCCCCAGTAGCTTCCTGAGAAAAGATGCTTTAGAGGTAAATATTTTGACACT
TGAATACTTGGAAAACTCTTCATTCTATCCTTGCATTTTGTTGTTTGTCTGGGTGAGGAATTCTTAAGAT
TCATTACTGCTCTCAGGCTTTCTTTAATTGTGGAGAGTGGTGGCTATTCTCTAGTTGCAATGCATGGGCT
TCTCATTGCAAAGCATGGGCTCTAGGCGTGTGGGCTCAGTAGTTGTAGCACACGGGCTTAGTTGCTCCGA
GACACATAGGATCTTCCTGGACCAGGGATCGAACCAGTGTCCTTTGTATTGCAAGGTGGATTCTTAACTA
CTGTATCACCAGGGAAACCCTGGATAAGGAACTCTAGATTGAAAATTGTTTTCCTTTAGAATTTTGAAAA
TATTGCTCCATTGCCTTAAAAAAAAAAAAGTTACTGTTGAGAAACAGGAAACCATTTTGATCTCTGTTTT
CTTTGTCTGAAAACAGAATTTTTCTTTTAAATCTTTTCTTTCTGTCCCCAGTGTTCTGAAATTTCACAAT
GACCTGCCTTGATGTGGGTAATTTTTCATCTGCTTTGTTGGAATGGTCCCATTTAATGTAGAAACTTTCC
CATCAGGTCTGGGAAAGTTTCTTGAAATATTTCATTGATGATATCCTCTTGACAACAGCTTTTGTTTTTA
TTTTTTTTTCTAACACCCTCTGACTATACCATTTCTGGAAATACCATTTCTCTCACAATATAAAGCCAGA
TCTTATGGTCCTAGAATAAAATCAGGGAAGTAGTGCTGGGAAAAAAATGAACAAAGACATCCCATTAAGT
CTGTCTGGAGTAGGAAGGAAGGCATCTCTGACTTTGAAAAGGGAGCTCCGTGGTACCCTTTTCAGTCCCT
TCCGGGGTCCTTTATGTCAGCCCAGTGCCTGGAAGCTTGGGGATCTCACCCTGTCAGATTGTCTCTGGGC
AGTTCATCTAAGATAACATCAGTGACCCTGGCAGGAGGAGCCCTTTGAAAGGTGAAAACCTGTGACCCTT
GGCCCTCAAGAAGGCGTATCTGAAAGCTAGATCCTTGACCCCAGCCAGCCCTCTTCCTGGGGCTGCCTCC
CTCGAAAGACTGGATCGAATTAGATTCAACCGGTGTGGACGTAGGTGTGGACACCCACAAGGATGGGACA
GAGACAAAACCCAAGAAAACCAGTCTGTGACATCACACACCACTCCAGAAGGCCTGCGGATGGTGACCGC
AGCCACGAAATTCAAAGACGCTTGCTCCTCGGGAGAAAAGCGATGACCAACCTAGACAGCATATTAAAAA
GCAGAGACATTACTTTGCCAACAAAGGTCCATCTAGTCAAGGCTATGGTTTTTCCAGTGGTCATGTATGG
ATGTGAGAGTTGGACTATAAAGAAAGGTGAGCACTGAAGAATTGATGCTTTTGAGAGAAGCAAAAGACTT
CTTTTGAGAAGTCTTGGAGTGTTGGAGAAGACTCTTGAGAGTCCCTTGGACTGCAAGGAGATCCAACCAG
TCCATCCTAAAGGAGATCAGTCCTGAATATTCATTGGAAGGACTGATGCTGAAGCTGAAATTCTAATACT
TTGGCCACCTGATGCAAAGAGCTGACTCATTGGAAAAGACCCTGATGCTGGGAAAGATCGAAGGCAGGAG
GAAAAGGGGACGACAAAGGATGAGATGGTTGGATGGCATCATCGACTCCATGGGCATGAGTTTAAGTAAG
CTCCAGGAGTTGGTGATGGACAGGGAGGCCTGTCATGCTGCAGTCCATGGGGTCACAAAGAGTCAGACAC
GACTGAGCGACTGAACTGATCTGACATCACAGAGCAAAAGTGTTGGATGTTGCCGTGACTGGGGTGGCCT
ACTCCAGCACCGTGGCTTCTATGGGACTCCATGCAGTAGAAGTGTCCTTCCATCCTCACCAGAACTCGAG
AGGAGACTGGAGTTTCAGCAGCTACTATGGAGGCACAGAGTCAGATGCCTGATGTGCCTTCCTGACTAGT
AATCCCAGTACCCAGCACAACGTGAAATCTGCTGACTGGTAAGGGCGCCCTCATGTGTTACTACAGGGTA
ATGTCAACTTGGTCTTCGCAGCAGGGACACAATTCCTCTGGGTATATCTCTTCATCCTGCGTTTCTCTTC
TCCCTGCGTCTGTCTCCTTCTCACTTACCTGTTAGACCACGTATGTGCTCAGAGAAGAAACACAGGCAGG
GCTTCTGAGTGTGACTTCTCTAGCCAGACTGACCATTTCGTCCCTTCACATTCTACACCTATTTACACAC
TTAAAAATTATTGAGGATCCCAGTATGTGTATGTGGGTTGTACTTATTGATTTATACTACATGAGAAACT
GAAACTGAGAATTTTAAAATGTTTATTTATTAATTAATTTGGCCACCTGCTGCAAAGAGCTGATTCGTTG
AAAAAGATCTGGACGCTGGGAAAGATTGGAGGCAAAAGGAGAAGGGGGAAGCAGAGGACCAACTCAATGG
ACATGAGTTTGAGCAAACTCCAGGAAATAGTGGAAGACAGAGGAGCCCAGCATGCTACAGTCCCTGGGGT
TGCAAAGAGTCAGATACAACTTAGAGACTGAACAATAACACTTTATTAATTCACTTAGAATAGCCACAGC
AAACCCATATTAACACAATACTTGAATGAAAACTAACCAGGGTTTTAAAAACAAACAAAGAGTGAAAAGC
ATGACATGGTTTTAAACTTTTGCAAATCTCTTTAATGTCTGGCTTAGTAAAAAACAGCTAGAAAAGAGCT
AAATTTGCTTCTGCATTTGCTCTCTATCCATGTCCCATGTCACGTAACCTCTGGAAAACTCCACTGTACT
CTTATGGGAGAATGAGTGAAAAGGGCAATTAACATCTTAATATTACTATGAAAACACTTTTGACCTCAAT
ATCCCCCTGACAGAGACTTGGGAACCCCTAAAAGATCTCAGACCACTTTGAGAACTGCCAAATTAAGAAT
ATAGTCACAGCGTTACATATTTATGTCAGATCTTTAATATTACCCATAAATGTGTATGCATGCTTAGTTG
CTCAGTTGTATCTGATTCTTTGCAACCCCGTGGACTGTAGCTCACCAGGCTCCTCTGTCTGTGGAGTTTT
CCAGGCAAGAATACTGGAGTGGGTAGCTATTTCCTTCTGCAGGGGATGTTCCTGATCCAGGGATCAAACC
TGGGTCTTCTGCACTGCAGGCAGATTTTTTACCATCAGAGCTACCAGGAAAGCCCTTATAAATATGCATC
AACTATTTAATTAATTAGTGGTTTCTTTGCTTCCTAGTGGCTCAGATGGTAAAGAAACTGCTTGAAATGC
AGAAGGCCTGGGTTCAATCCCTGGGTCGGGAAGACTCCCCTAAAGAAGGGAATGGCAGTCGGCTTCAGTA
TTCTTGCCTGAAAAATCCCTTGAAGAGAGGAGCCTGGTGGGCTACAGTCCATGGAGTCGCAAAGAGTCGG
ACATGACTGAGTGACTAACACTTTCACTTTCACTTTTTAGTCCTTAAGGAAATCATATTTTATTGTTAAC
AAGTAACTTTGCTATGATATACATATGTTGTATGTACATCTGAAAAAGCAATCTATACAGCTTGACCATT
TGAATACTAAAATATTTCAACTTTGAGAACTGCCTAAAAATATACATATGTATAATCACATGGGATTTGC
CTGGTGGTCCAGTGGTTAGGACTCCAAGCTTCCACTGCAGGGAACACAGGTTCGATCCTTGGTTGGGGAA
CAAAGATCCTACATGCTGTGCAGCATAGCCAAAAAATAGAAAAAAGAAGAAGAAGAATATAGTCACATGA
ATCACAGGCCTGGCCATGGATCTAGAATCTAAACAAATTCTAATGGTAATTTTTTGAGGTTAAGGTTCCC
TTTGCTATTCTAGCCAAATAACTGAGGTTGCACTGAGAAGGGCAGGGTTCATGCTCCCATGATGTTCTGG
GCTCTCTGCTTTCTGCTTCCCCGGGCTGCCTATTCAAGTTCTGGAACCCATAGCTTCACCAGGATTTAAT
CACTGTTTGCCCATAACAGTGTCCTGCGATGCCTTATCCCTCAGAGAGATCTCTATTGATGGGGATTTAA
TATACATGGAAGAGCCACCAAAGGGACCTTTCAGGTCAGGAGATTGGGGTGTGTTCACAGCAGTCTTTGC
CCCCCTGGGGCTGACCCCATGACAGTGTCAACTGAGACTTCTGGGAAGGAAGGAGGTAATCCCTGGTAAT
GCCCTGGACCTCTGTGAAGTGGGTTCTAGAGTCTGGCTGGTGGGGTTCTGAGACAGCCACTTACTAGCTG
GGTGACCTTGGGACGTTCTCTTTGTCTTTCGCAACTTCAGTTTCCTCATCTATAAAACCGGACTAATTGT
ACCCACCTCCCTAGTTGCAGATAATAAATGCAAAGCACTGAGGCATTGACTGGGGTCTCAGTAGCCATGA
ACCCCAGCCATGTCACTTCCTGTGTGTGACCTGGGCAACTTTCTTTTAGGAGGTCTCGACCTTCCAAATA
TTTCTAGTCTCTGACCGCTTCTTGCATCTGTTCTGTTATCACCCTTGCCCCAGCTGTCACCATCCATCAG
TCCCTGCCATCTTACTCTGCTCGATTTTACTCACCACGCGGTGGTAGGTGCTCAATGTTTGTTGAATTTA
GTTGAACTGGATTATCTGAACTGTAACTTCTCCATCTTTTTTTTTTTTTGGTTGTGCCACCCAGCATGCA
TGTCTAGTTCCCAACCAGGGATTGAACCTGTGCCCCCTTTAGTGGACACATGGCATCCTAACCACTGGAC
CTCTGGGGAATTCCCCACTTTTCCATCTTTAAAATGAGAAAATTGGACTAGAATTCTCAGGGTCCCACTA
GCTCTAACATCCCATAACTTTCTTTATATATCTATGTCGGAAAATGACTTGTGAATGTATCTATAAATAT
GTTTTTCTGGTCATGTTTAATGAATGATTTAACAGCCCTGTTACAGTCTCCTGATTCAGCAGCCATCACA
GGAGCGGGACCCCTACCCTTCTCAGTGCCAGGCAAGTTCCTTTGGGTGAAATAAGAAAAGGGAACCTGAA
CTCCAGGAAGTAAGCCAGAAAGAAAAACACCAATACAGTATACTAACACATATATATGGAATTTAGAAAG
TTGGTAATGATAACCCTGTATGCGAGACAGCAAAAGAGACACAGATGTATAGAACAGTCTTTTGGACTCT
GTGGGAGAGGGAGAGGGTGGGATGATTTGGGAGAATGGCATTAAAACATGTATAATATCATATAAGAAAC
GAATCGACAGTCCAGGTTCAATGCAGGATACAGGAAGCTTGGGGCTGATGCACTGGGATAATCCAGAGGG
ACGGTATGGGGAAGGAGGTGGGGGGGGGTTCAGCATGGGGAACACGTGTACACCCGTGGCAGATGCATGT
TGATGTATGGCAAAACCAATACAATACTGTAAAGTAAAAATAAATAAATAAATAAATAATTTTCCACAGT
TTGTTGTGATCCACGCAGTCAAAGGCTTTTAGCATAGTCAACAAAGCAGATCTTTTTTGGAATATCCTTG
CTTTTTCTATGATCCAGCAGATGTTGGCAATTAGTTCTGGTTCCTCTGTCTTTTCTAAATTCAGCTTGTA
CATCTTAAAGTTCTCAATTCATGTACTCCTAAAGCCTAGCTTGGAAGAGTTTGAGGATTACCTTATAGCA
TGTGAAATGAGTGCAACTGTACAGTAATTTTAATATTCTTTGGCCTTGCCTTTCATTGGGATTGGAATAA
AAACTGACCTTTTCTAGTCCTGTGGCCACTGCTAAGTTTCCCAAATTTGCTGGCATATTGAGTGCAGCAC
TTTCACAGCATCATATTTTAGGATCTGAAATAGCTCAGCTGGAATTCCATCCCCTCCACCAGCTTTGTTC
GTAGTAATGCTCCTAAGACCCACTTGTCTTCGCACTCCAGAAAGTCTGGCTATAGATGAGTGATCACAGG
ATCATGATTATCTGGGTCATTAAGATCTTTTTTGTATAGATCTTTTTTGTATAGTTCTGTGTATTCTTGC
CACCTCTTCTTAATCTCTTCTGCTTCTGTTAGGTCCTTATTGTTTCTTTATTCCTTTATTGTGCCCACCT
TTGCCTGAAATGTTCCCTTGGAATCTCTGATTTCTTGAAGAGATCTCTAGTCTTTCCCATTCTATTGTTT
TCCTCTATTTCATTGCAATGATCACTGAGAAAGGCTTTCTTATCGCTCCTTGCTATTCTTTAGAACTCTG
CCTTCAGTTGGGTGTATCTTTTCCTTTCTCCTTTGCCTTTTGCTTTTCTTCTTTTCTCAGCTATTTGCAA
GGCCTCCTCATACAACCACTTTGTCTTACTGTTACATTTCTTTTCTTGGGGATGGTTTTGGTCACTACCT
CCTGTACAACACTATGAACCTCCATCCATAGTTCTTCAGGCATTCTGTCTACCAGAACTAATCCCTTGAA
TCTATTCATCACCTACACTGCATAATCATAAGGGATTTGATTTAGGTCATACCTGAATGGCCTCATGGTT
TTCCATACTTTCTTGTATTTTGCAATAAGGAGCAGATGGTCTGATCCATAGTTAGCTCCAGGTCTTGTTT
TTGCAGACGGCGTAGAGCTTCTCCATCTTTGGATGCAAAGAATATAATCAATCTGATTTTGGTACTGATG
ATGTCCACATGCAGAGTCGTCTCTTGTATTGTTGGAAGAGTGTGTTTGCTATACCAGTGTGTTCTTTTGG
CAAAACTGTGAGCTTTTGCCTTGCTTCATTTTATACTCCAAGGCCACACCTGCCTGTTACTCCAGGCTAT
CTTTTGACTTCCAACTTTTGTATTCCAGTCCCCTACGATGAAAAGGACATCTTTTTTTTTTTTTTTTTGG
TGTTAGTTCTAGAAGGCCTTGTAGGTCTTCAGAGAATCATTCCAAGTTCAGCTTCTTCAGCATTAGTGGT
TGGGGCACAGACTTGGATTACTGTGATGTTGAATGGTTTGCCTTGGAAA
>gi|546671514|gb|AWWX01449617.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig449616, whole genome shotgun sequence
ATGTTATTATATTTTATTTGATTTTCCTGATAGGTATTTCTACCAAATTACTTCCTAAAGTTATTTTTCT
TTTTTTTTTTACTATTTCCCCCCTTTTTTTTTTTTTTTTTTACAAGCTGCATGAGTTTTGCGTAAGAAAT
ATCCTCTAATGCAAAACTCAAGCATTGACTTATTATTTCATTTAATAGCTATTTATTGTCTCTTAAGAAG
TCCAGAAATTGATGTAAACTGAAAGACCACTGACAAATGATCACCTCAAAGCAGCCGTGACTGCTTCCCC
GTCTAAATACTGAAACATGTCAGCATAAAATCCCAAGGAATTGCATAGGAAGCGAACCTAATTCCGTCTT
GTATGCTCTGGGAAGTTACAGAAAATGAAAATGTGGAATTTATGACCAGAAGGATGAATACCAATTAGCT
AAGCAAAGGGTTAGGGAGGAGGATGCAGACTAAAGGGATTACATTTTAAATGGACCAGATCAGAGAACGT
GAAATTGTGTTAAACAGATTCCCCTTGAAAATAGACTATGAAGGCTGGGGAAAGAGAAGCGTTAAACTTG
GTGGCATAGACAGAACCAGAAAGGACATTATATATCATACTAAATAATTTGGGCCTTTGTACCTCTGAAA
GATTTTAGGTAGGATAACTTTAAAGTCATATTAATGCTTCAAACTGATGACTATAAATGCAGTGTCTGTC
AAGACTGGAAGTTAGGGAATCAAGATAGTAGAGATGTAATTAAGAGAAGTATAAAAGTATAGAACTACTG
AAGAAGAAAAGCCAAAAGTACTGAACTATTTAATGGGAGATGTTGGGGAGAAGAAGGAGTAAGGGTCGAC
ACGTTGCTCATCTGGGAAAGTGAATATACAATGCTGTCATTCACAGGAATAGGAGAAAGGTCAGGAAGGA
CAGAAAATGAGTCCTGGCTTAAATAGGTCATTTTAGAATATTAGTTTATTTTCCCAATGACTTTAAAGCC
CAGGAACTTGAACTAGAACTTTGAAATGCTGAGCATGGAAAGTAAAATAAGAATTGAAGACAGTCATGGC
TTGAACTCTAAATAAACAAATATCAAAGAGAAACAGGAGCCCAGTAGAGCTGAAGATGGTGTGATCATCA
GTAAGAGGTGGAGAGAAAATCAACATAATATGATATTATAAAAAACAAGTTAAGGAAGCATTTGAGAAAG
GAGAGAGTGGTCGTCAGGATTAAAATCTGGAGACAAGAAAGGAAAAAAGTTTCAATACCTTTGAGGTAGA
ATGCAAATTGAAACGAATTGAATAATTGACTAAAAGTGGAGGAAATACAGGTAAAATGGCCTACAAGTTA
AGAAGCCTGATTCTCAAGAGGAAGAAAAAAATTGTGTGATAAAAGTGAATCATATGGGTTCAAAAATAGA
GGAAGTTTTGAAGCTGGAAAAGTTTTGAATTTCTATAGTGATAGTGAAGTCACTCAGTCGTGTCCGACTC
TCTGCAACCCCATGGACAGTAGCCAACCAATCTCCTCCGTCCATGGGATTTTCCAGGCAAGAGTACTGGA
GTGGGTTGCCATTTCCTTCTCCAGGGGATCTTCCTGACGCAGGGATCGAACCTGGGTCTCCTGCATTGTA
GGCAGACTCTTTACCGTCTGAGCCACCAGGGAAGGCCAGATTTCTATAAGGACCAATGAAATCTGTATTA
GTTTGTTAGGGCATCCATAACAAATTGCCATAGGTTGAGTAGCTTAAACAACAGCAACTTCTTTTCTCTC
AGTTCTAGAGGCTGCTGCTGCTGCTGCTACTAAGTCACTCCAGTCGTGTCTCACTCTATGGCAGCCCTCC
AGGCTCCCCCGTAGTGGGGGTAGGTTGCTCTGTCAAGACCAAGGGCCAATTATTTTCTTACCATGAAAAC
CAAGAAGAAGGTGACTACAGGTGATTCAACCTCTAACACATACACATGCACACACAACGTGGACACTCAG
AGAGTTGAGTTAAAGCATAACTATTTTACCTCCAAATTACTGCTAATGCTGAAAAGTACAGGTATTTATC
TAATGTGTTTCAGGGTCATGTGTGGGAAAATTGAATGTTAGGTGGAGTTTAATTTTACCAGCAGCTATTT
GAGATGAACTGTCAGAAAGAAACTTGCTTTTTAAAAAATATATTTGTCTTTATTTAGCCTCTCAAAGATC
AGGTCAGCAGGCTTCCTTGATGGCTCGGAGGCTAAGAATCCACCTGCACTGCAGGAGCTGCAGGAGACGC
AGGTTGGGTCCCTGGACTGGGAACATCTTCTGGAGTAGGAGACAGCAATTCACTCCAGTACTCTTGCCTG
GGAAATCCCATGGACAGAGGAGCCTGCTGGGCTATAGTCCTTGGGGTCATATAGGGTCGGACACAACTGA
GCACTCATAAACACCCACAGTCTCTCAAAATTCAGAAAGTTGCCCATTCTAAAGGATATTGTTCTATGTT
GAATATAATGAAATCCACGCTTTGTGCTCTGAGACAATGGGAATATCCTTAAATTTGGCTGATGGGGCGT
GAAATTCAATTACTGTTCATTCAAGTATTGTTTGAACTATTCCTTTGTTCTTTTATAAAATTTACCATAT
GGAAAGGAAATACAGTGTCTTCCAGTTGAAAGATATAATTGGCATTAATATCCTATAATTAAATTTACTT
TATTTTCCATTACTTTTTGAAGACTTATATCCAGAAAGACAGAAGAGAGAGAGAGAGAGAGAGAGAATAG
TAATGAGCAGAATAAACTTGAATAACTCGACTAGATGGCTTATATCAACTTGATGCTGAGCAGTACCCTA
TTGGACTTCCCGACATGTCCTGCATTTCTTGCAGGCAAGACTCCAGCCTCCATGACTTTTCCTGAGTTCC
AAGGGGCAGATCAGTTGCTAATCAAGGGAGCTCCAGCAACGAAACTACCTGGGGCAAGATTTAAAAGGCC
AGGGAAGCTCATTAAGATTAGAAGACTCACCACCCAAGACTCTGTGTTGTGTATTCAGTACTCAAGTCAT
GTCCAGCTCTTTGGGACCCTACGGACCGCATCACACCAGGCTTCCCTGTCCATCACTGTCTCCCAGAGCT
TACCCAAGTTCATGTCCATTGAATCGGTGATGCCATCCAACCATCTCATCCTCTGTTGCCCCCTTCTCCG
ACTCAGTCTTTCCCAGCATCAGGGTCTTTTCCAATGAATCAGCTGTTCAAATCAGGTGGCCAAAGTGTTG
GAGCTTCAACTTCAGTATCAGTCCTTCCAATGAGTATTCAGGGTTGATTTCCTTTAAGATGGACTGGTTT
GATCTCCTTGCTGTCCAAAGGACTGTCAAGAGTCTTCTCCAACACCTCAGTTCAAAAGTGTTAATTCTTC
AGTGCTCAGCCTTCTTTATGGTCCAGCTCTCACATCCATACATGACTCCTGGAAAGAACATAGCTTTGAC
TACACGGATCTTTGTCGGCAAAGTGAGGTCCTTGCTTTTTACTACACTGTCTGGGCTTGTCATAGCTTTC
CTGTCAAGAACCAATCGTCTTCTAACTCCAGGGCTGCAGTCACTGTCCGCAGTGATCCTAGAGCCCGAGA
CGCGGAGATCGGTCACCTCTTCCACCTTCTCCCCTTCCTTCTATTTGCTGTGAAGGGATGGGGCTGGATA
CCATGGGCTTAGTTTTTTTTAATGTTGAGTTTTAAGCCAACGTTTTCACTCTCCACTCACCCTCATCAAG
AGGCTTTTTAGATCCTCTTTGCTTTCTGCCATGTGAGTAGTATCATTTGCATATCTGAGGTTGTTGATAT
TTCTCCCAGCAATCTTGATTCCAGCTTGTAACTCATCCAGCCTGGCATTTTGCATGACTCTACACATTCT
AATCTTGTCAGCAACCTCTTCCTCTTGGAACTTTGCAGCAGAAAGAATGCAAAGTTACTGTTACTGCCTT
AATCCTTATCACATAACCTTGCCTGATTATATAACCTCTCCTTACTCACCAGGGAGAGGGGCACAGTTCT
TGGGGCGCTAGGCTACTGTGTTCCCTCTTTGCCTGGAAAGTAATAAAGCCATTCTTTCCTTCTTCTTTCT
GTTGCTGTATTTCTGTTTAGCAGTGATGCACAGAAAGCCAAGATGTTAGCAAAGAACTCAGCAAGTTTTT
CAATTTAATTTTTATAGCAATCATCTATAGTATAAATAAGTTAGCAATACATTTCATCATGAGATCATCT
CATTTAATTACTCTACCAAAGATAGAGCATCTCCTCTGAAGTATCTGTGTCAAATGATACATCATATTCT
CTTAAACCAGAATCTGCCTTTGTGTTAGTTTCTCTGGGATCCAAATAACTTCATCCTGCCTGCAAAAGTG
AAAATACTGGATCAATCTTCCCTTTGAAACCTCCAATTTCCACATGATGGATTACAGTTATTTTGGAGAC
TAGCTACAATTTTCTTATCAGCCAAGAAAAATATAGAATGTGTTATTGGTTTTCATGAAAGGACTGCAGC
ATGATTTTTATAAACCTGCAATTTCAACATCTATTAAAAATGCTGGGGCTTAATTTAGACTTCTCTTGCC
TTAGAATGCAAGCCATGAATCTTACAGAATTGAATTTATAAGCAGCCATTATTCATTAGAAGGTCTGTTG
CTAATGCTGAAGCTCCAATACTTTGGCCACCTGATGTGAAGAGATGACTCATTGGAAAAACCCCTGATGC
TGGGAAAGATTGAGGGCAGGAGGAGAAGGGGGTGACAGAGAATGAGATGGTTGGATATTGTCACTGACTC
AGTGAACTTGAGTTTCTGCAATCAATGGGAGATAGTGAAGGATGGGAGACTGGCGTGCTGCAGTCCGTGA
GGTCACAAAGAGTCCAAAACATAGTGACTGAAGAGCAAATTGTTTCACTTAGAAACTTCACAGATACCAA
GCACAATGTTTGCAGATGAACAGTAGAACCTTCCTTATAGCCTTATCAGGGAGTAGGTGTGGTCAATAGC
AATTAATTTCAGTTTGTTTTTTATGCTCAAAAAATAGTCAGGGTAAAAGACTGTTTTGAAGGAAAGCATA
AATAGTAGATAACTAAAACAAGAAAAATAAATAGTACATATGTTTTTATATCGATTATCTAAATTCACCT
GACTTTAAGTATGTTACAGTATCAGGCTTTTAGTTAAAACACAAAGTCAACATAAACCAAAGACCTCAGA
CTAGCCACTTACATGTGATTTAAATAACATAATTAAGAATCATTTCACACCCCAAAAGGGGTAATTGAAA
GACTGCTAAAACATTCTAATTTCAAATTGTGCCAAGTGTTAAATCCAATGCAAAAATAGCACCAAGCTAT
AATCCCAAATTTTCCTTATGTCAGTTCCCATTTTTTTTCTTGCATTCATCCTGTAGAGAAGAAAATAAGA
ATTTCTAAGTTCTCACACCAATGATGTTAGTTATATGAGTTTTGGTGCAATTCTGAATATTGATACTGCC
TCATATTTTCAGACAAGTCTTCTGAGTGCTCCCACTGTAAATTATCCATTCCTCCTGACCTGTAACTCTC
ATGTCACTGGGCTGTAATTTCTCAGAGCACTCAAAACTATCTGAAAGTGTAAAATTCATTTGATTATTGA
TGATTTTATTATCTGAACACCCTTCTCTAACGTCCCCATGAGAGTGAGGCTCTCAATTTATTCACTTCTG
TATCTCTGGTTCCCAAGATGTGATGGTGTATATTGTTGTTTAGTCACTAAGTCGTGTCTGACTCTTTCGT
GACCCTATGGACTGTAGCGCACCAGGATTTTCTGTCCTTAGGATTTCCCAGGCAAGAATACTGGAGTGAG
TTGTCATTTCCTCCTCCAAGGGATCTTGCCAACCTAGGGATTGAACCCACGCCTCCTGTATTGCGGGCAG
ATTCTTTACCACTGAGCCACCAGGGAAGCCCAATGGTATATGCTGTGCTGTGCTTAGTCGCCCAGTCGTG
TCTGACTCTTTGTGACCCAGAAGGAAATGGCAACCCACTCCAGTACTTTTGCCTGGGAAATCCCATGGAC
GGAGAATCCTGGTAGGCTACAGTCTATGGGGTTGCAAAGAGTCAGATACAACCAAATGACTTCACTTTGG
ACTGTTGCCCACTAGGCTTCTTTGTCCATGGGGATTCTCCAGGCAAGAAGATTGGAGTGGGTTGCTGTGC
CCTCCTCCCAATGTTATTTATTAAGTAGCTAATAATGATCTGGAATAAATCAATTATTGTTTTTTCCTTA
AAAACATTTACAGTCCTGAACTTCTTCGTCAGGACTCAAAATGTTTACCTATGAAATAATACCTTTAATC
ACAAAATAAAATTCTCCATCTAATGGAAACTGTTGACTCATATCAATTAGGGCACCGTGACCATGGGCCA
AGCTCAGAAGTAACTGAGTTGGCTCTACTCCACAAACTTCTTCCTCACACATCAATACCCTTAAAAATCA
AATTGAATAAGAACACTGAATAATAAAAGTGGAATCCATCAAAAGAAATTGCTATGTTACTTTTATTAGG
AGAAATCATCTAGAAAAGGGTGCTGTTTCTAAACATATGATGAATTTAAGTAAAGATGACTAATGACAAT
TAACAGCTGGCAACTCAAGAGTTATTTGATCTCAAAGATAAACATAGTACAGGAGCAAGTTTTCCGGAAA
TCTAATTCCAAGCAGTGACAACCATTAGAGATCACATTAAAAAAAAAAAAAAAAATCCCAAAATTACTTA
AAATTCAGATATGGCTTTAAAAGTGTCATAATATGTAAGACTTAATTTTTTTCCAGCAACCTATACCAGA
CACCTATCTTCTTAAAAAAGAGCAAAAGATATTGCTGATTTAGAAATAGAAAACATTGAGAATTATGTAT
TTTAGATGAAAGATTATATGACTAAATTTTTCTTGTTATACTTACTTTATTTTGCAGATGTAAATCATAA
ATCTGAGGTCAGAGTTTTGTTTTCTGCATCATTCTGTGTTGTTTTCCAAACACTGGAATTTTGTACAGAA
AATGGGAAGTAGATGGCACTCTTGGACTTGCAATCAGATGCGGGAAGCCTGTGCAGGATGGAATTCAAGC
ACCTAACTCGGCAAGACCCGGTCTCTCACTTCGTTTACCAAAACTATTTTTTTAACTAATAGCTTCTACT
CTATGCTGCTTCTCCCTCTTACAGTGTATGTATTAAGAGACAAAAGTACTTTTTAGCTATTGACAGTCCA
ACCTGTGCTTTTCCCTTTCGCAGTGTATTAAGAAAATCATATTGCTGAATCATTCATACAGGAATTTTTT
TTACAAATATAGCAAAATTATTAATAAAATTATTTAAAATGAAGTGGAGAAATTTACAATGGGTCAGGCG
GTACATGTTTTTCAGTTCTGGGGAACATATGTCTTGGCTTGGAGAGATCACATTTACAGAGAAAATACGC
TTTAGATGTGATTATCCAGGATTCCCACAATAAGAGAAAGCATTTATCCACACTCAATAATCCTCAGTTT
GTTCATAACACTCATTTTTTGATGCAGGCCATATATATTTAATTGCCTCTTTTGATTGAGATAACTGCAT
GTTTTATCCCATTTCTTCTCCTTACGTCAAAAACGCTGCCAAATTGTTAGCAGTCATGAGAAATTTCACC
TTAATCTCCAAGTCTTTTGGTACCTGTGACCCACACATTCTCTCCTGAAAGGAAATCTAATCATTTTCTT
CCTTTCCGCACAGGGACCTTCTGTGTCCTGCCATATCACACGTCACATCACAGTCATGTGTGAGCACGTG
TCATCCCCTCGCTAGATTCTCGAGAGCCACGGTCACGAAGCCTCAGTCTAGAGACCCCAGAACCAAAGCC
AGCCACCTGTCCTTCTTGGTCTTAGGTGAGAGGCCTCCCACCAAAGGCCCAGCAAGGACCTACCTGGAGA
GTGGAACAGTTTGCAGAGCTTCCATCTTCAGCCCCTTTCCCCCACCCGTGATCCCGTTCTCCTCTCTCCC
CCAGCAGGTGAGCAGCCTCTATCTCGCCCTGCCGTGTCACTTGGGCTGGTCCTCACTAGACAAGGTAAAC
ATATTTGTGTGTGCCGAGTTGTTCAGTCGTGTCTGACTCTTGGGACCCATAGATTGCAGCCTGCCAGGCT
CCTCTGTCCTAGGATTCTCCAGGCAACAAGACTACAGCAGGTAGCCTTTTCCTTCTTCAGAGGATTTTCC
CGACCCAGGAATCGAACCCTGGTCTCCTGCATTGGCAGGCGGATTCTTTACAGACTGAACTACATTGCTT
CTTGACCAAGGTAAATATATAATTATCTGTTTAAAGGCTTGAGAATAGAAAGATTACTACTAAAATACAC
AGGCCTGAAACTTGCATTAACAAACTGAATGCCATTTAGCAGGATAACACTTAACCTGAGATTCACATAT
GAGATGAAGAGAAGTTATTGGTCCTTTATCATACAGCCTCAGTTACCCACTACGTGCACACGTACTGAGC
TCCTCAGGGAGATATATTTCTATGAAATGATGGAAGATAGAGCGATTGAATTTTGTGTTTAGTTTTCTCT
TAACTGAGAAGAAGAACTGAATACAGAGATGAGAGCAGACTCTCTCTCCAAAGAGAAAATCTTCAGATTC
TCCCACCGCTAATCTGTAGGAAAGAGGCAGGAGAACCCACTTCCGGAGGTCACAGTATCTCTACCTGCTA
GAAAAAACAGCCCGATTGGAGATTTTACTGCCGATACATCCACAGGAACCCTGTTCTGTAGGATGGGAAC
CCTTATTCCTCTCTTGGAGGATGGTCCAAGGGGCACTGTGAAAGAAGCAACGATTTATGGTGTGAATTTG
GGTAAATTAGAAAAACCTCATTAACTTCATAGAGATTAATACCTTCATTAAGATCTGAAGGTCAAATGAA
ATAATGTTCCCACTCACTCATTCTCTCATTTATTTGATAAAAAATTAATAACTAATTGATTACTAAGTGC
TATTCACCATTCTAAGCATTAAGGATATTGTCTTTAAAAAAAAGAAAAAAAAACATTGGTATTAGTGTAC
TGGCTATTGAGTTACTAATGAATAGTATTCATTCTGTAAATAGCAGTGATTTATAGGCACACATTTGTTT
CCCTAGAATGCTTACATTTCTTCTAGTATATACAAAGTATAAATTCATAATCTTTACTTCATAAACCCAG
GACCACAGATTTAGTACAGGGGTAAATCCAAATTAAATCTTACACATATGAAAGAAAAGATAAATATTAT
TTTAGACATGGCCATTTTAAAGATATCTGGTTTTCAGTATTTTAGAAGCAACTTAAGATGATAAGAACAT
ATTATTACTCAGTCTGGTTTTTAACTGCTTCCTCAGAGCTGGGAAAAAAGCAGCCCCGCCTTCTATTACA
TATATATACTAAACTAAAACATGCCCTGTGCTTACTCAGTTCTTGATAGTACCAGGAGCAAGTGATATTT
TTTACTTTAAAAGAAGTCTTTCATATATTTATTTTTAATTATGAAGGATTTCAAACATGATAAAAGTACA
GAGAGCAATAAAGCAATGTGCTCACCACCCAGATTTTAACATCGCTTCACCATATTTACTTTGGGATTTT
ACATACACATTTATTATAAAATGTGAATATGTAAACAGTAATACACTCTATTATATAGATTTTTGAATAC
ATAAAAATAAAATTTATATAAAGTGTTACAAAAAGGATTAAAAATCTCTCTCTCACCATCGACCTGTCCC
CTCGTGAGGAAAGCATGCACATCCTTCTCATTATGATTTATAACCTTATACATATTCACTATCTATAAAT
AACAAACACAATTCCTCTGTGTATTTCTAAGATTTACACCATAATACGATTTACCTTCCTTTTTTAACCT
CAATGTTAGATTTTAAGATTTATCAGTTCTGAATGGAATTAATTAAATTTCTTTCTACTGTATAAAAGCG
CCATAGTAAGGTGTTCCTTGTGTATATTGATAAACATTTATTTTTTCACTGGTTTTTGCTTTTTCAAAAA
AGGCTAGAATGATCACGCTGGTATGCTCACCTGCACATATGTTCAGGTCTTTTAAATAGATTTTAATTCT
GTAAGTAGAATGGCAGGATCAAAACACAGGCAGAGCTCAGCTTCACTAGAAATTGCCAGCTGCTCCTCTA
AACAGTGCAAACATTTACATTCCTACAAGCCCAGTAGGAGAGGTTTTGATTCACCACGTTCTAGCTAAGA
CTTGACATTATCAGAGTTTTAAGGTATGCCGATCATGTGAAAATCATATTTTGCTGCTGTCGTGACTTCC
ATTTCCCTAATAAACAGTAAGATCAGGTGTTTTTCATGTGTCTATTGGCTATTGGGGTTCCCCTTCAGTG
AACGCCTGTGTTTTTTATTGATTTCATAATTGACTTACTTTCTCTTGCTAGTTTATCTGTACTTTTCTAA
GTTTTAAACTTTTACTTTTTACTTTCAGATTTTTAATGTTGTCTATGCTATTGAGAAATTTGTATGCAGG
TCAGGAAGCAACAGTTAGAACTGGACATGGAACAACAGACTGGTTCCAAATAGAAAAAGGAGTTCATCAA
GGCTGTATATTGTCACCCTGTTTATTTAACTTATATGCAGAGTACATCATGAGAAACGCTGGACTGGAAG
AAACACAAGCTGGAATCAAGATTGCCGGGAGACATATCAATAACCTCAGATATGCAGATGACACCACCCT
TATGGCAGAAAGTGAAGAGGAACTCAAAAGCCTCTTGATGAAAGTGAAAGTGGAGAGTGAAGAAGTTGGC
TTAAAGCTCAACATTCAGAAAACGAAGATCATGGCATCCGGTCCCA
>gi|546669842|gb|AWWX01450698.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig450697, whole genome shotgun sequence
ACGGGGGACAGCACTTCCGCCTCTGCAGAGAAGGAGAAGGGGTCAGCGGGGCCTGGACCCTCCCCCCCGC
ACCCAACCAGGGGACGGGCCCGACTCACTTTGAGCGACCACCCTCACGGTGCCGACGGTCTTTCCTCCCT
TGGGGTGCTGGACTTCGCACACCAGGTAGTCATCTGGCCCTTGAAAGGCGCTTGAGGAGGGCAGGACCAC
CTGAGAGGAGGCCGACCACAAGCCGCCCCTCAGGACTTCGGGGAAGGTCCGGATTTTCTCACTGCTGACC
GTGCTGTTGTTGAACTTCCAGGAGAAGCTGACAGAATTGGGCACGAAGTCCCGGGCCAGGCAGCCCAGGG
CCACCGTGCTCTCATCAGACGGGGAGCTCACGCAGGACACCAGGGGGAAGACTCTTGGGAGCGATTCACC
TTCTGGGGACCCGAGAGAGGACACAGGAGAAGAGGGGGGTGAGAGGTGTCCTGCTGGTAGGGGGTGTGGG
CAGCTCCACCTTCTCTCTGGGACAGTGGAGCGGAGGGCACACTCAGCCCTGCCAGCCCACCCTCACTGTC
TGTGATTACCCACCTGGGGCCTGCCCTGGGGGTCTGGGGTCATCAATAAGACTGATACACACTCAGGCTC
CCAGTCCTCAGCACAACCAGATCACTGAGGTCAGCCCACTGTTGACCAGGACAGTCCAGTGCGGTCAGCT
CAGTCCATCTAGACCCACCAGCCTCAGTGGAGGTTAAATGCACCCAAAGCATCTCAACAATTTGCCCAAG
TCAAGCCTGCTCAGTGGGTTCACTTCTGTTGGCCCAGTCTCAGTGCACCATGGTTAACCCAGCATACCCC
AGTTAAGCCCAGGCTAGCCCAGACCAGCTCAGCCCAGCTCAGCTCAGTTCAATCCAGATCAGCCCAATCC
AGGCCAGCTCAACCCAGCTCAGTTCAGCTCAGCTCAGCTCAACCCTCTCAGCCCAGCTCACCTGCTCAGC
CAGCTAAGCCCAGTTCTGCCCAGCTCAGCTCAGCCCAGCTCATCCACTCTGCCCAGCTCAGCCCAGCTCA
GTTCAGTTCAGCCCAGCTCAGCCTAGCTCACCCACTCTGCCCAGCTCAACATAGCCCAGCTCAACCCAGC
TCAGCTCAACCCAGCTCAGCTAAGCCCAGCTCAGCTCAGCCCAGGCCAGCTCAGCCCAACTCAACTCAGC
TCAGTTCAGCCCAGCTCAGTCCAGCTCAACCCAGCTCAGCCTAGCTCACCCACTCTGCCCAGCTCAACAC
AGCCCTGCTCAACCCAGCTCAGCTCAGTTCAGCCCATCTCACCCACTCTGCCCAGCTCAGGCCAGCTCAA
CCCAGCTCAGGCCAGCTCAACCCAGCCCAGGCCAGCTCAACCCAGCCCAGCCCAGCTCACCCACTCTGCC
CAGCTCAGCCCAGCAAAGCTCAGCCGAGTTCAGCTCAGCTCAGCCCAGCAAAATTCAGCCCAGCTCAGCC
CAGCAAAGCTCAGCCCAGCTCAGCCCAGCTCACCCAAGCTCAGCTCAGCTCAGCCCAGCCCAGCCCAGCC
CAGCTCACCCACTCTGCCCAGCTCAGCCCAGCAAAGCTCAGCCAAGCTCAGCTCAGCTCAACAAAGCCCA
GCTCAGCTCAGCCCAGGTCAACCCAACTAAGCCCAGCCCAGCCCAGCCCAGCTCACTCATGCCACCCTGC
TCAGGCCAGCTCAACCCAGCTCAGGCCAGCTCAGCCCAGCTCAACCCAGCCCAGCCCAGCTCACCCACTC
TGCCCAGCTCAGCCCAGCAAAGCTCAGCCCAGCTCAGCCCAGCTCAACCCAAATCAGCCCAGCCCAGCCC
AACCCAGCCCAGCCCACACACTTGGCCCAGCTCAGCCCCCTTCAGCCCAGCTCAGCCACTCCATTCAGCT
CAGCCCAGCTCAACCCAGCTCAGCCCAGCTCACCCACTCCACCCAGCTCAGCCCAGCTCACCCACTCCAC
CCAGATCAGCCCAGCTCACCCACTCTGCCCAGCTCAACACAGCTCAGCTCAGCCCCCCTTAGCCCAGCTC
AGCCACGCCATTCAGCTCAGCCCAGCTCACCCCAGCCCGCTCAGCCTAGCCCAGCTCAGCTCAGCCTAGC
CCAGCTCAGCTCAGCCTAACCCAGCTTAGCCCAGCTCACCCACTCTGCCCAGCTCCGCTCAGCCCAGCTC
AGCCCAGCACAGCCCAGCTGAGCCTAGCTCAACTCAGCTCAACCCAGCTCAGCCCAGCTCAGCCCAGCAC
AGCGCAGCCCAGTGTAGCTCAGCCCAGCGCAGCTCACCCACTCTGCTCAGCTCTGCCCAGCCCAGCTCAG
CGCAGCCTAGCCCAATTCAGCTCAGCCCAGCTCACCCACTCTGCCCAGCTCCGCTCAGCCCAGCTCAGCC
CAGCCCAGCTCAGCTCAGCCTAACCCAGCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCGCCCAGCTCA
GCCCAGCTCACCCACTCCGCCCAGCTCCGCTCAGCCCAGCCCAGCCCAGCCCAGCTCCGCTTAGCCCAGC
CCAGCCCAACCCAGCTCACCCACTCTGCCCAGCTCAGGGCAGCTCAACCCAGCTCAGGCCAGCTCAACCC
AGCCCAGCCCAGCTCACCCACTCTGTCCAGCTCAGCCCAGCAAAGCTCAGCCAAGCTCAGCCCAGCTCAA
CAAAGCCCAGCTAAGCTCAGCCCAGGTAAACCCAACTAAGCCCAGCTCAGCTCAGCTCAGCCCAGCCCAG
CCCAGCCCAGCCCAGCTCACTCATGCCACCCTGCTCAGGCCAGCTCAACCCTGCTCAGGCCAGCTCAACC
CAGCTCAGGCCAGCTCAGCCCAGCTCAACCCAGCCCAGCTCACCCACTCTGCCCAGCTCAGCCCAGAAAA
GCTCAGCCCAGCTCAACCCAAATCAGCCCAGCCTAGCCCAACCCAGCCCAGCCCATACACTCGGCCCAGC
TCAGCCCCACTCAGCCCAGCTCAGCCACTCCATTCAGCTCAGCCGAGCTCACCCACTCTGTCCAGCTCAA
CACAGCTCAGCTCAACCCCCCTTAGCCCAGCTCAGCCACGCCATTCAGCTCAGCCCAGCTCACCCCAGCC
CGCTCAGCCTAGCCCAGCTCAGCTCAGCCTAACCCAGCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCG
CTCAGCCCAGCGCAGCCCAGCTGAGCCTAGCTCAACTCAGCTCAACCCAGCTCAGCCCAGCTCTGCTCAG
CCCAGCTCAGCCCAGCACAGCGCAGCCCAGAGCAGCTCAGTCCAGCGCAGCTCAGCCCAGAGCAGCTCAC
CCACTCCGCCCAGCTCCACCCAGCCCAGCTCAGCGCAGCCTAGCCCAATCCAGCTCAGCCCAGCTGACCC
ACTCTGTCCAGCTCCGCTCAGCCCAGCTCAGCCCAGCCCAGCGCAGCCCAGCTCAGCTCAGCCTAACCCA
GCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCACCCAGCCCAGCTCAGCACAGCGCAGCACACCTCACC
CACTCTGCCCAGCTCCGCTCAGCCCAGCCCAGCCCAACCCAGCTCACCCACTCTGCCCAGCTCCGCTCAG
CCCAGCCCAGCCCAGCTCAGCCTAACCCAGCTCAGCCCAGCTCACCCACTCCTCCCAGCTCCGCTCAGCC
CAGCTCAGCCCAGCCCAGCCCAGCTAAGCACAGCCCAGCTGAGCCCAGCTCAACTCAGCCTAACCCAGCT
TAGCCCAGCCCAACCCAACCCAACCCAGCCCAGTGCAGCCCAGCTGAGCCCAGCCCAGCTCACCCACTCT
GCAGCTCAGCCCATCTGAGCCCAGTTAAACTCAGCCTAACCCAGCTCAGTTCAGCCCAGCTCAGCCTGGC
TCAGCCCAGCTCAACTCACCCACTCTGCCCAGCTCAACCCAGCTCAGCCCAGCTCAGTTCAGCCCAGCTC
AGCCTGGCTCAGCCCAGCTCTGCTCAGTCCAGCTCAGCCTGGCTCAACCTGGCTCAGCCCAGCTCAGCTC
AGACCAGCTCACCTGGTTGGCCCAACCCAGCTCAGTTCAGTTCAGTTCAGCCTGGCTCAACCTGGCTCAG
CCCAGCTCAGCTCACCCACTCCGCCCAGCTCAGCTCAGCCCAGCTCACCTGCTTGGCCCAACCCAGCTCA
GTTCAGCTCAGCTCAGCCCAGCCCAGTCCAGCACAGCTCACCTGCGGTTGGTGGCCCGGGCTGCCCTCAC
AGACGTGAAAGCCCAGTGGTCCTGACAAGAAAGGGTCAGATCCCGGACCCGTGGCCTCGGCTAAAGCCCC
TGGTCTGCAGACGCTGCCCAGCTGGGCTCACCCCTCCCAGCCTCTTCCCGCTTCTCCTGGGTGCCCGACG
CCTCCATCCCCACACCAGGCCCAGCTGGCCCTTCTCCCAGCCGTCAGTCACCACCACCCTCCACTCTGGG
TGAAAAGCATCGTGATGACTTTAGCTTCCCTAGAGCATCTCACAGGCTGAGACATGCTTGCCACCCTCAG
ACAGAGGCCCTGTCTCTGATAAGCAGGCAGCGCTACTTCTCTGGGAGAGGAGAACCTGGGCACACGTCCC
TGGGGCCTGGCCACGTGCCGAGGGCCTGAGATCCTGCCCCAAGTCTAAAACAGTCCTGGTGACTAACTGC
TCTCTGGCAAATGTCCTCATTAAAAACCACTGGAAATGCATCTTATCTGAACCTGCTCCCAATTCTGTCT
TTATCACAAAGTTCTGCTGAGAAAGAGGATACTCTGTAGCACAGAGCGACCATCTGAACCCCAAAGCTGC
ATTGAACACCTAAGTGTGGACGCGGCAAGTGGTCCCTGTGGATGTGAAGCACCCTGGCATCGCAGGCAGT
AGGTAAAGGCAGATTCCCTTTCAAGTAGAAACAAAAACAACTCGTAGAAACGTCCCGGGGCAGCGAGTCT
GGCTGCACCGGCTCCTGCCCCTCACAGCTCGGCGCCTGGTCCCTGGCACGTCCCGTGGGCTCTCTGACCT
GGGCGGATTCCTCCGAATCCCTTCGCTGCGCTAACTCGTGACCTGCCCGCTGGCCTGGCGGCAGAGGCCA
GGCCCACACGTCCCCAGGTGCGGGCGCTCCCAGGCCCCGCTGACTGCCACCCCACCGGGCATCCTCTCAG
TCCCCCAGCTAGTGGTGTAGCAGAGTGACTCATGACGAATGCCCCCGTTTCACCCAAGTCTGTCCTGAGA
TGGGTACCCGAAAGGCGGCCCTGAACATTCTGCAGTGAGGGAGCCGCACTGAGAAAGCTGCATCATTGCC
AGGCAGGAGCCGGCCAGCTACGATTGTGAGCACACTCAGTGCACACGGCATGCGCACGGTCTCAGCTTAA
CTACCTTGAAGGAGTAACTCATTAAAGAGTGTACCAATGCATTGATAAAGTGCACCTGAGACAAATTAAT
TTCTTAAACATCGACTTTGAAAATGAATATAAGTGAGCAGTTGATAGCCTCTGAAGGAAATACATTCCAA
CAGGTGCTGAGAACCCCCAGGAGCAGGGAACGGACTCCCCGTGGAGCCCCAGAAGGAGCCAGCTCTGCTG
ACACCTTGGCCCTGGGCCCTCCCTCACGCTGGAGAGAGCCAGCTCCTTTTGTTCACACCTGGCCTGTGCT
TCTTTGTCGTCATGGCCCTCAGACAAGCCCACAGGTCCTGACCTCAGCCCCTCAGCCTCCGTGCAGCCGT
CCCCCTCCCCTGCTGGAGGCACCCTGCCTGCCGTGGAGCCCCTCACCCAACATTCCCCTGCCTGATGGGT
TGGGCCGCAAAGGACAGCGTTTAACCAGAAATGCCTTCCAGGAGCCTCCTGCTGGGAGACGGCCTTCTCT
GGGGACCAGGTCCACTCCCACTCCCTTGGACAGTCACTGTCAGGCCCCTGACGGCCCTATGAGAGGCGTC
CTGGGAAGCCCCAGTCTCCTTCCTGCCCCTGAAATTGCCTCCCTGGAGAGCCAGATCACCCTTACCGAGC
TCCCTGCCCTGGCCCCCGGGGTGTCCTCTCCCGTCCCACCGCCCACCCTACCCTGGACCTCCCCGGGGCC
CGAGCGTGCCGGCGCCCCTGTCGGCCCCCACCTGGACCCCCGCAGCTTATCTCTGAGGGCTAATTCCCCT
GTCCCCTGTCCCGCTGCCAGCTGCCCCCTCTTTCCAGGCCTTTCCTCCGTGCCTCTCCAGTCCTGCACCT
CTCTGCAGCTTCACCTGACACTTCCTTTCACCCTCCAGGCACCGTCTTCTGGCCTGCAGGTGAGGTCTCG
TGCTCCCTCAGGGCATGGTGTGCTGCACACACACCGGCCTCCTCCCGAGTCCCTCCTGCACACACCACGC
GAACCCGAGGTTGACAAGCCCTGCCGTGGTTGGGGTTCCGGGAATGGCGGCAGAGAGGGACAGGGTGTCC
TTGGGGCTGGTGGCAGGGTCCTCCCGGATGCACACAGAGGCCCCAGCTCAGGCCACCTTGGGAAACCAGT
CCTGGGATCTGCAACTCGGCCATGTTCCTGCATCTGGACCAGCCCCAAGACACCACCCTGGCGTGGCGCC
ACTGGCCTGGGAGGAGACACACATCCCTTTCCCATCAGCAGTGGGTTCAGCGCTCAGGATATGCAGCCCA
CAGGAGTGTGGCTTGGGGGAAAAAAACCTTCACGAGGAAGCGGTTTCACAAGATTAAGTATACTTGTTTA
TTTCAAGGCCACAAATGCGACATTGCAAAGCAGGGCCAGGTGGAGCCTATAACTGCGGGCTCCATGCTTG
GCCATGGCACTCAGCTGCTCCGAGAAAAGCCAGTTTCTTACAGCTTGAAGCTGGAGAGGACACAGGGAAA
ATTTCAGTACAAGCAGACAAGCCACATAGGCCAGGGACCGGCCCCAAGGCAGCCCCTCATGTTCCGTCTC
TGGTGCCCCATGTTCTTGGCTGGCCATCACTTCACCTGCAGGTGACAGAGACAGTGTGAGTGGGTGCAGG
GGCGCTGGGGGTCTGCGGCCCCGGGGCTCTGTGGCCTCACCCCCTCCCCGGGGCCAGGCAGCCTACCTTG
AACAGGGTGACCGTGGTGCTGTAGAAGAGGCTCAGGAGGAAGAGCACGATGAAGGTGGAGGCCATGGTGT
TGAGGTTCTCGAAGCCTTCCTCCTCGGCGCTCACCTCCCCCTCTGTGGGAGATAGAGCACGGTGGTCAGC
ATGGCTAAGCTACCTGCAG
>gi|546669977|gb|AWWX01450566.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig450565, whole genome shotgun sequence
AACCATGAAATTAAAAGATGCTTACTCCTTGGAAGGAAAGTTATGACCAACCTAGACAGCATATTAAAAA
GCAGAGACATTACTTTGTCAACAAAGATCCGTCTAGTCAAGGCTATGGTTTTTCCAGTAGTCATGTATGG
ATGTGAGAGTTGGACTATAAAGAAAGCTGAGCGCCGAAGAATTGATGCTTTTGAACTGTGGTGTTGGAGA
AGACTCTTGAGAGTCCCTTGGACTGCAAGGAGATCCAACCAGTCCATCCTAAAGATCAGTCCTGGGTGTT
GATTGGAAGTTGATTGGAAGTTGTTGAGTTGAAGCTGAAACTCCAATACTTTGGCCACCTGATGCGAAGA
GCTGATTCATCTGAAAAGACCCTGATGGTGGGAAAAATTGAGGGTGGGAGGAGAAGGGGACAACAGAGGA
TGAGATGGTTGGATGGCATCACCGACTCAATGGACATGGGTTTGGGTGGACTCCAGGAGTTGGTGATGGA
CAGGGAGGCCTGGTGTGCAGCTGTTCATGGGGTCGCAAAGAGTCAGACACAACTGAACGACTGAACTGAA
CTGAACTTATTCTCAGAGCCTGCATCATTCTCTTCCCAGACTGCTCTGGGCTTCCATGGTGGCTTGGATG
GTAAAGAATTTGCCTACAATGCAGAAGACTCGGGTTCAATCCCTGGGTTGGGAAGATCCCCTGGAGAAGG
GAATGGCTCCCTACTCCAGTATTCTTGCCGGGAGAAGCCCATGCACAGGGGAGTCTGGGGGACTACAGCC
CTTGGGGTTGCAAAGAATTGGACACAACTGAGCAGCTACCACCTTTTCACTTTCAGGTTGCTCTGGGGTG
CTTCCCAGGTCTTCAGAATGCTCCGTATACCCATTGGTATTTTTTCCTAATCATGGAACAGTTGTTGCTT
TGTTATTATTGTTGTTATAATCGTTTTCATAGGTTAGTTACCTAACAATTAAATTCCCATTAACACCAAT
TTTTTTTCTCATATAAGAATCCTAATTTCTTCAACTTTTTCTATACATACCCTCTTCTGCTTTTCTCCTG
ATGTCCATGAATATCCTCTAAGAGGCAGCTACACTCTTAGTCTTCTTCCTTTTGCCTCATCAAATGTAGG
GGGTGCTAGAAGGAAGAGGTACATCACACATTTGATGCAGACTTCACCTGCAGAGAAGTTTATTTGGAGC
TGCAATTGTGGCATTCAGAAAAGGCTTATCAGACACAGGTTTGCTATAAAACACTTGTTATGACACTTGA
TGTTTGATTTCTTCTCGAAACAGAGCTCCCGTGTCTCAGAAACCAAGCATAGTTTTCCATTTGGTCCTCC
ACCAAAGAAAATACTGTCAGTTCCAGGGAGTCAAATACAAGTCTGTCTGTCCCAGTCCATTAAAAAAAAA
AATTATTCAACTGCTATTATTAGTAGATGTCAGTTTCTAAGGGCAAATCTGAAGAGTGAAAAATAATGTG
TGCAGTTACTTTACATTGATATAGGCATGGCTATTTTTGTCTATCAAGGAGGGGAAAAAAGACTACTTCA
TGTATGTTCCATGCATGCACACGATGGGAGCCATATTTCTCTGTCAAGCAGTGAAGCCAGTTGGAAAGCA
AGGGCCAGTAATAGAACTGGAAAGTCATGCTGATGCTTTCTGTGCTTGTGAGATTCTAGCTCTGTGGACT
GGGATTTTACAATTCCCAATTGCAAAACTGTTCGTGTATCTGATGATCCATGCCATTTCAGGCAGGTGTT
TCTCTGTGGTCCAGTTATTCCACCTGTGAAATGAGTGGTTACAACAGGACTCCTGGGTAGTTTATTAGCA
AAATGGAGTCATTTTTGAGCAAGATATCTGCAGGAAAAGCTGCTAGGAACCTTGACTATTCTAGAAGTGA
AGCAAGTTTTTTCATTTTTATTTTTTAGGAGGCCCAAGGACAGTGTATTGTGGAGAAGGAGGTTATGGTT
CTCATGAACATTCTCTGTTTCCTTTGGACCTTTTTGTTCTTCTGGTTCTCAGAAACTTCGAGACTTGCTT
TCCTTTTTCCTCTCAGTGGAGGGCAATCTTCTCTGAGAATTAGCCAATACCAACACTAGATAGTTCTCTA
AACAGCATTGCCTTTTTATTGTGGGATCTTGCAGATATGAGTGTAAATATAGGAATTTCCAACCATAGAA
TGGCAGTTACAAGGGGCGGGGGAATTTCCAGTGCCTCATAATTGATCACTGATGCCATCCACTATGAAGG
CCAGTCTTGGACCCAAGAATATACTGAAAGATGGGGATGATTTCACCTGGAAGCATTAGCAAATTCTGGA
ATATTCTTGATGTCAGAATTAAAAGCAGATTCTAATTTTGGCTTCATGGGTTCGTGGGTCGTCCTGTTTG
GAACCTTCCCTGGAGATCCTTTATATGTTTACTTCTGTTCTGTTTCCCTGGCAAGGCGTGATGTAGATTT
CGTTACTGCTTTCTGAAGCTCTGGAGTAGCGGGAAAGGTTTCCTCTCCCTTTTGCAAACTCCTGTGAAGG
AGCTTAGTAGCAGTACCAGGGCATTTGTTTTCCTGTAGAGTGGATGGGGGTGGACAAGAGCGGAATGAAC
TTCCTCGCGTTCATCAGTGAGAGTCAAGTCCTCTCATCCTATTTACTATTTCTTACACTCTAAGCCATCA
TTTCTCATAAGAGATTTTTATGATATCAAAAGCAAGCACAAACCCGCAAATTGGTTGGGCATAAAGAATA
TGTATTACAAGGTTACTCCTAACTGTGAGAATCATTAAGCCTTTTTTTTCTATGAGATAATGTGGATGGT
CGCCTATGTATGGGGTTGGCCAAAAAGTTTGTTTGGGTTTTTCCACATGCTGGTATAGAAAACTTGAATA
CACTTTTTGGCCAACCCAGTAAGGGCTTTGCCTCATCTCTGTCTAGCCAAATTGCCACCTTCCCTGCTAA
GCTCCACATCCCAGAGTGATCACCTTCTAAATCCCTTCCTCCTATCAGATATCAGATACCTCGAACCTAG
TCATGTACTTATGTGAAGTTTGTGTTGTTACCTTTTTAAGCAGTTTACATTGTATTGGATACACATTGTA
ACTGCATGACATTTCTGCAGGGCTCTATTCTTCTGGTCAAACTGAAGATCTGACAGCTATGAAGCTTTCT
GGCGTCCCTGGTGGCTCAGATGGTAAAGAATCTGCCTGCAATGCAGGAGACCTGGGTTTGATCCCTGGGT
TGGGAAGATCCCCTGGAGAAGGGAATGGCAACCCACTCCAGTATTCTTGCCTGGGAAATTCTATGGACAG
AGGAGCCTGGCAGGCTACATACAGTCCATGGGGTCACAAAGAGTTGGAAGCGACTAAGCGACCAAGACAC
ACAACTCCTGAAGTTCCCTACTTGGCCTTCTGTTTGGTTCACTTAAATATGTTCGCAAGAGATATTTATA
ATATATAATGCAAGGCAGGAAACTGGTATTCATGCATAAGAAATAGTGTTGCTGCTGCTGCTGCTAAGTC
GCTTCAGTCATATCCGACTCTGTGCAACCCCATAGACGGCAGCCCACCAGGCTCCCCCGTCCCTGGGTTT
CTCCAGGCAAGAACGCTGGAGTGGGTTGCCATTTCCTTCTCCAATACATGAAAGTGAAAAGTGAAAGGGA
AGTCGCTCAGTCGTGTCCGACCCTCAGCGACCCCATGGACTCCAGCCTTCCTGGCTCCTCCATCCATGGG
ATTTTCCAGGCAAGAGTACTGGAGTGGGGTGCCATTGCCTTCTCCAGAAATAGTGTTAGGCATTATGTTA
TATGATTTCAAATTTCATTTCATTTTTGCAAGCATTACCTGAGATTAGATATTGTTATCTCTGTTATCTT
CAGAAGAGAAAAGCAAAGCTTTCTTTATTTTTTTTTTTTATCCATAGCTACCCTTGGGGAGTGTCAGGAC
CAGGGTTCAAAGTTATGACTGTAAACTTCAGAATCTGTTATTTCCCACTACTCCAACAAAAACACTGGAG
AAGTCACAAACGTTTACAATGCCAGACTTATCAAAGACATTTTTTAAGGTGCAAAATATATTTAGATAGA
CTCAGTAATCTTTAGTTAAAGAATAAAATCGATGTGTCAATAACATTACTTTTCTAAGCTCTTATGTAGT
ATTTATACCCTCTTCTCCCCACTTCTTCCCTCACACCCACCTTTGCCCCACTCAGGCTGTAGGAAGTTCC
CTAACTTTCTATTCATATTTGTCCCACTGAAATTCTTATCAATCCTGGACTTGTTTCCCACACACCTAGT
GGATCCGGTTTCCATTTGAGGTATATAGTTCTTAAACATAGAGCAGATTTTTTCTCTGAAGTAATTGAAC
AGGAGAAGTTGCAGACTCAGAGAAATGAGATATTATCATAGGTGAGGGTGAGGTAGTGGATTGGGAAAAG
GGTTTTGAGGGAAAGTTAGCTTAGCTGGTAAAGAATCCGCCTGCAATGCAGGAGACCCTGGTTTGATTCC
TAGGTTGGGAAGATCCCCTGGAGAAGGAATAGGCTACCCATTCCAGTATTCTTGGGCTTTTCTGGTGGCT
CAGATGGTAAAGAATCTGCCTACAATGCCGGAGACCTGGGTTCGATCCCTGGGTAGGGAAGATCCCCTGG
AGAAGGGAAGGGCTACCACTCCAGTATTCTGGCCTGGAGAATCCCCTTGGACTCCAAGAGTCCCTTGGAC
TGCAAAGAGATCCAACCAGTCAATTCTAAAGGAAATCAGCCCTAACTATTCATTGGAAGGACTGATGCTG
AAGCTGAAACTCCAATACTTTGGCCACCTGATGCGAAGAGCTGACTCATTTGAAAAGACCCTGATGCTGG
GAAAGATTGAAGGCAGGAGGAGAAGGAGACGACAGAGGATGAGATGGTTGGATGGCATCACTGACTCAAT
GGACATGAATTTGAGCAAGCTCCAGGAGTTGGTGATTGGCAGGGAGGCCTGGCATGCTGCAGTCCATGGG
GTCACAAAGAGTTGGACATGACTGAGTGACTTTCACTTTCTGCCTAATAATGCCTAGGACACAGGGCTAG
CTTTCCAGGCACAGTGCTCAGGAGGGTCCCGTGCTTTACTGAATGCCACCATCTTGAAATCCTTAATAAT
TTTATTTTGAACTTGTGTTTTGTAAGTGAAGCTCAGTGGGATGATGGAGCATGATCATGAGCAGAGGAGC
TAGGCATAATGTGCGTTTGAGAATAGTTTTTGTGATGCCCCAGGAGAACAGAATTCCAATGAACCCATAT
ATGTGGTAGTTGAGCAAGACTCAAAATTGTAATACAAGGTGAGCAGAGCACATTAGCCTATAAGAGAGGA
CATTGAGGCACATCCCAAGGGACTGTGCTTTCTGTTCAGATAAATCAGAACTTTCAAATGCAGAAAAAGG
CAGCTGCATTCTAAGAAACACAGCCACCAAGGAACCGTATCCTGCCCTTTCTTATTCCTGTTACTTTCCT
GTATTAGCCAGACCACCTACAATAAAGCTGAAGAAAGAAGAAAGGGGAAAGATTGGGCAAGGCAGGGTTC
CCTTTCAGTTCTATCTTACTCATCAGTGAGCTCAAGGTAGAGTATATGGCTACAGAATATCAAGAAGTGA
AATGAAAATAATCTTGTGTGTGTGTTAGACACTGTTTTGGCAAGAATGAAATATGTATATACAAGTACAC
TCCATGAAACAAAAATTATGTAATTTTGGTGATTCTGCCTAAGGGTTAAAGTTCTGATAATTGCATTTAG
ACTTGGCATGGTACAATATAAAGATGAACAGTAAAATTCATGCTCATAATGTAAAATTTTATTTATTTTT
TACTTAAAACAACTTTAAATAGCATCTTAAAAACACCATGACCAATTGAGAGAGATTACAGAAAAAAGGG
AAAAGCTTTCTATTTTGTTAACGTTAATGGCATTTTTCCTGCTTTTGGAACAAGGGGCCTAGGTTTTCAT
TTTGCACTGGGTTGTATAAGTCACCTAACTGGCCCTGGTGGGATGTTTGTCTCTCCCTGAGGTTCAGCTG
CCAGGATTTTACCCCGATGATTCTACCCTGGAGAAAAGAACTTCCATTAATTTAAGTGAAAGTGAAAGCT
GTTCAGTCATGTCCAACTCTTTGCAACCCCATGGACTATACAGTCCATGGAATTCTCTAGGCCAGAATAC
TGGAGTGAGTAGCCTTTCCCTTCTCCAGGGGATCTTCCCAACCCAGGGATCGAACCCCAGTCTTCCGCAT
TGCAGGCGGCTTCTTTACCAGCTGAGCCACAAGGGAAGCCCAACAATACTGGAGTGGGTAGCCTATCCCT
TCTCCAGTGGATCTTCTTGACCCAGGAATCGAACAAGGGCCAGGGTTTCCTGCATTGCAGGCGGATTCTT
TACCAACTGAGCTATCAGGGAAGCCCCCTCCATGAACTTAGGAGTTCACCAGTATGCAGCGGACCTTCCC
TGCACTCATTCTAGAGAGAAACCCACAAGGGTATGGGTCCTCCCACATGCACTGGATTTCTTATTGGTTT
TAGGCTCTCACTTTATGAACAGATAGGCAAGGACCTACAAACATCACAGCAAAATTTCCAGCCTTTTCAT
CATCCTGCTACTCTTCTCTGGCTTAGAAGATGCCCTGCTTCTCGATTCATTGCAGATACAGACTCAGATG
CAATTTTTGTAATTGCATTTCTTGTCTGAAAGTGGTCTTCATAGAAAATACAACTTCAGTGAGCTTCTCA
AGAGAGAAGCTGTGATCAGGTGAAAAAGTCAGTCTAAACTCTGCATTTAACACAGTTGATGAAGTCATTA
ATTCCAATTTTTGGAGAAATATGTTCAAGCTGGGCATAGAGGAAAAGAGGATTGAAACAGGTGGCTGTGA
TATTTGGGGGAAAGCCCGATTGATTTTAAATTGCAACTCGGAGGGAAGAACAGAATGTTGATACTAGCAG
TGTCTGGCAGGCAGCTCAGTGGAATATTAGGTCTTCCTAGAATCAGGCTTCATACTAAACTGACAAGCCT
CATTTCTGAAGCTTGGAAATCCACCAGAGGTGTGGGAAGTAGTATGTGCATGGCTATACCCTGAAGCTGG
CCAAAAAGAGTTCTTGACTGTCCACTGTCCTGGGCTGTCTGCGGGTGACTTGGGGGGTGTGTGTGGAGGA
GGTTAGCATATGAGAAGAAAAGCAGGAGGATTTACAATCAATTTAAGAGAAACGAGACATATTTCTTACT
CTTAAATAAGTTAAACAGAAGCTTTCTGGGAAGGAGGGCATCTTCTAAGTCAAATAATCCAGTTGGCTTT
CTGCCTTTTGAACCTTATTTGTCTCGCTGTACCCATTCGTCTGAGTCTATTCTGTGGGTCTACACAGAAA
AATGTAACCCTCCTCCCCATGATGGCCCTTCCCACTTTTGAAGGCAGCTACGGTGTGCTTTGTAATCCTC
TCTTCTGAGAATCAGAGAGCCCTCTTTCCTTCAGCCTTTCCTCATTGCAAAATTTCCAGCCCTTTCATCA
TCCTGCTACTCTTCTCTGATTCTGCTCACTTGTCAAAAATTTATTGTGCACCAACTATAGGCCAGGCATG
CCTAGAGCTAAAGTACCAACATAAATAAGATATAGACCCTCCCATGAAAGAGTTCTGGGGCTAGTGGGTG
AGACAAATGAAAAAACTGTATACGGTAATAAGTGCTATAATACAGATATGCATGGGAGATGTTATGGGAG
TATAATTTGCCTTTTTGTGTAGAAATAAGCAGTGTGTTTCACGTAGGGTTTGACTGGTGCAGAGCAGTGA
AGGTCTCTCATATCTTAGAATTTAAAAAATCTCATCTGTAACCCAACTTATAAGCCCCTTTTCATACCTA
ATGTTATTTTTTTATGACTTCCCTTTTGATCTCTTTTTCTGTATCATTGCTGTTGCTTAGTCGCTAAGTC
GTGTCTGACTCCCTGCAACCCCCTGGACTGTAGCCTGCCAGGCTCCTCTCTCCATGGGATTTCTCAGGCA
TGAATACTGGAGTAGGTTACCATTTCCTTCACCAAGGGATCTTCCTGACCCAGCGATCCAACCCACATCT
CCTATATTGGCAGGTGGATTCTTACCACTGAGCCACCAGGAAAGCCCTTTCTGTATCATACTCAACAATT
GTCTGTTTTATTGGTTCTTATTTTTCATTCACATATGCTTTAATTTTTATCAATTTTTCTACTTTAAAAA
ACTGAGTTATCATCTCTAGTTTTCAGCCTTGTTGAGCTCTAGCACATTCTTTTAAGTCGACAGATCTCCT
TTGGGTAGTGCTTTCACCACACTGCGTGTGTTTTGGTGGAGGAGGTTGTTCTGTGGTTGCTCAGTTATAA
ACGGTTTGTAAAGTTAGGTTTGTCTAACCCCAAAGTTCTCCCAATGACTCTCTTCAGATACTGTCGTGCA
GAAGCCCTGCACAGTCTGACGTCTCTGCTAAGAGGACATGGGGAGCGATCGCCCCTGCGGAGGGTCACTC
CTCTCACAGCATCCCTTTGCCTTCCTCACCACTTTGAGAGGAACAGCAACACCCTCCGCGCCCCAATTAG
GGCACACAACGAGGGGAGCTGGCAAATGGGCCTGTTTTTCTTAACTATGTTGCCAACATGCGAGAATGAG
TTCTGGAGAGAGACCCTTTGCCAGAACAACTAGTTAAAACAGGAATATGATTGTGTCTAGATTTCATCAC
AATAGGATGAGAAGCATCATTAAGCAAAGTGGGAAGGCGGTGGACTGGTGTAAATCGACTCTCCTACTCA
TTCAGAGCAGGAACTGTTCAGTGAGCTGCAGGTGAGCCTCATAAAGCTTTGTTTCACGTGGATGCCTGCT
GGGCTGGCGACTGAGAAAATAATGTAGCCCAGTTTGTTACTCTTAAAATAGCCTTTTATTATCATCCTTC
CTTCAGGACACAGTGGTTTCATGATCCTTTTCTCCTTTTCCCTATAAAAATGCACCCTTAAGAAAAGATC
CAGAAAATCCAAGATAGATAACAAGATAAAAATGATTGATAACAATATTCAACCTTACCCAGACTGACAT
TCATTTTAAAGGGACACAAAACCTTAGGCTGAAGGAAACCAGCATGGATTTGTAGACTTTTCATCATCAG
TAGTAGAAGGGACAGTTTGGATGGAGGTTCTAGTGTGGATGGGGCGTGGCTCGAGTCTTAAATGAGCCAA
GAGGACTGGGTATACGCAAGGCAGTAGTTCTCAAGCAAGGAATATCCGCCCCTCCCCGAACCCTAGGGGG
ACATTTGGCAATGTCTGGGGATGAAGGCAATGGCACCCCACTCCAGTACTGTTGCCTGGAAAATCCCATG
GATGGAGGAGCCTGGTGGGCTGTAGTCCATGGGGTCGCTAAGAGTTGGACACAACTGAGCGACTTCACTT
TCACTTTTCACTTTCATGCATTGGAGAAGGAAATGGCAACCCACTCCAGTGTTCTTGCCTGGAGAATCCC
ATGGATGGAGAAGCCTGGTAGGCTGCAGTCCATGGGGTTGCACAGAGTCGGACACGACTGAAGCGACTTA
GCAGCAGCAGCAGCAGCAGGGGATGTTTTTGGTTGTCTTAACTGCGGTGAAGGTTACTAGGGGCATCTAG
TAAGTAGAGGCTAAGGATTCTAGACAGGCCCCCACAACCAAGAATTATCTGACTGCAAAACTCTGTACAG
GAGTTCCAAGGCTGGGGATACCTATTGTAGATGGACACAGATTTGGTAGTGACTCTGGCCCTCAAGTGGG
AGGGGCAGCAGGAGAGCCTAGGGGTGCCAGCCGGAGACGTATGGCCTTCTCTGAATAAAGGAGTCCCCAC
TGCTTAGCAATCGAAAGTCACTGTCCTACGACCCAGTATAGACAGATCTGACTTCTTTGGAGAATTCTGA
AATCCGGATTTTTACATGAATGGCTACATTTACAAACACCAGCAATGAATCCAAATGAACAAGGTTGAGA
AAGCCACACAAAAGACGACCATGAGCTGCACTTGGCTTTTGGGTCGCTACCTACAAGCTGTCACCCAGAC
AGTCATGTCCTAAGAAGTATAGTGCAGGTTGCCAATCCAGGGCTGCAAGGTACCCGGGGAAGGGTCTGGC
CACTAGCCCTTCTGTCTGCAAAGGCCACGACTGTCCTACTGAGGCAGGAGGCACAGAGAGGTCTGTTTTA
GCTCTGTGGGGGAAAATAAAAAGGTATTGATGATCTAAAATTGCTCTATCATTATTCAACATAGTTTCCC
AAAGCAATTTCACCTTTTCCTCTCGTAAAATTCTCTTGGGCTCAATATGCACATTTTGGATTTTGGACTT
CTACTAATTCAAATCATGTCCAAAAGATTAAATAAGATAATACACAAGCAAGTTCAGAATGTAGTATTGT
CTGGCACTAAGAGAAATGTCAGTTGACCTTCACTATGGAGACACTGCTTAACTATGTGAACAGGACTGTA
GATACTGTTGTTTAACTGGATAGAATTAGTGACCAGCTCCTGGGATAATGAGCTGAGTTCACCCGCTTGA
ATTTGGAGGGCAAACACATGTTCTTTGTGTCTGGGAGGTATGCTTGAGGTTTCCTTTGTCTCTCCCTGCT
TTGGGGAAAAGCAAATTTGGCATGGCCTCTGAATGTCTGCCTCCTCCACGCTTTTTTTTTTTTTTTTTTA
GTATAGAAGCTTTGTAAATTCGCCTTGTTCAAAAACAAGCACCATCTTCATACCTGCTGAAAGGTAATTC
TGATCCAGAGGCTGTGTAATGAACAAAGGGGGCAGTAATACCAACGGGCGCTCAGACAGCTCAGCCCACG
CAGGTACACGGGCACGGCTGCACCGACCGGGGAGGGAGGCCCTTTGTAAAGATCCCTTTTTATTGTGACT
CTTTCACCTCCCCCAGCCCAGCCTGAGCCATGAGAAGAAGAGCGATGTCAATGAGACAGGCCAATTTTAC
CCATCTTACACAATCGTGCATTACAGACTTTCCAAGAAGACGCCATTCGGGGTGAATCACCAGATTCAGG
ATGAAAAGGTTTTGATTTGCAAAAGCGCTCAAGCATTTCTCTTTAGAGCCGGAGATGGCAGGGAAGGAAC
CTATGTGCATGCAGCTGCTTCACCTTTGGGCCAGGTGCACTGGCAAGCACTCCGTGGTGGAGCAGGGAGA
AGGCACCCAAGCATTTACAGGCGCCTTTGTGACACCCTGGGTGCCTGGGCCTGTGAATGCCTTATTTAAG
CCAGCAAAGAATGAAAGGGTGCTGGTGACGGTGGGGGTATTGAGACGTGTGGAACTGAGTCAAAGCATGA
ACTACTTGTTAACATCAACCGAAATTAATGTTTAAGCTTTCAGGTTTCAATTGGATTTCATCAACCCAGG
AAAGAGGTGATAAGAAAGTGCTAGATTGAGGATTCTATTATTTTGGATGATTTTGGCTACTGATAACAGA
AAATTTATGTCAAATGACTGAAACCCTTTATTACTCAAACTTTGGTCTCAGGACCGCCTTCACCATCACC
CGGGAGCTTGTCAGAATTGCAGAATCCACAGCTCCAGTCCAGACCTGAATCAAAGTCTGCATTTTAACAA
GACACACAGGTTATTTGTATGCACTTTATAATCAGAGAAGCACAGCCTTCAACAAGAACTCTTGTGTTTT
TCTTTTTCTAAATCTCAGAGACCAAGTCGGCTAGAGGTAGGCAGTTCCAGGGTTGGGTTAACTCAGCAAC
TCAAGGATGTCGAAAAGGACCTGGGTGTTCTTCATTCTTCTGCTCTGCTCTGCTAAGTGTGTTTACTTTT
ATACTTGGATTTATCTTCCTGTCGTCACAGGTGGCTGCAGAAGCTCCAGGCATTACCTTGCACAGCTGAG
TGAAGGGGTGGGAAAAAGGAAGCTTCCTCTCACACATATGCCTTTTTAACAGCAAAGAAACATTTTCCAG
AAGCCACTTAGTAGATTTCCTTTCATGTTATATGTCTCATATCTGGCTGTCCAGAATTGCAATTTCTTAC
CCATTCCTAAACCAATCATTGGCAAAGGATTAGAATAATAATCATTGGCTAAAGCCAATGAATATTTAAC
CTCTGGGACCAGGTAGGAGTCTGGGCTTCTTCGATGCACATGGCGATTAAATATATGAACAAAATGGATT
TCTGTAACTTGGAAGAAATGAGACTGCATAGGAGAGAGACGGTTGGTGGGAAAATCAACAGTGTGTGCGG
CAAGAATTTTTTAAAGGCTGGTCATGTATATTTTGGGACATTAGGAATTTCCTCGTGAGTGTTTGTTTGA
GAGTTAAAAAAGAACCATCTGTGAGGGTTCCCTTTATGTCTGAGCTAAGAGATAGTGTTCTAGTGACGTG
TATAGCCTTGGATGGTTTGTGCTCCAGGAAGTTCTTAATAATAGAATAATGCAATTATGTTTTCAAACGA
GACAGCTAGAAGGGGCATTTTAGACTACTTCGTGTGTCGTGCCCATACAGTTTATAAATGAACAAAAAGT
CCAAATGACCCATACCACACCTTGGTCAGTTGTGTGGAGACCTCGATAGTCTCCTTTTATGTGCGACCAT
CTTGATGGAGGTAGTTCATTCTTTCAATTTTACATCTGGACTTTAAAGAAGTACATCATTTTACTTTTCT
GGGGGAAAATGAATGATTATGTGGAGTGAATCACATAGAGGCCATTTTAAGACCATCTAAGAGAAATAGG
ATTTAATTAAGCCGGTCAGAGGATCCCAGCTGTCATGAAACCATGAGACCCCAGAGAGCAGGGTCTATCT
GATACAATCTGTAAGCACTTGGCAAGTGTTTGTAGAACAATTAATGGAGTGGATAAATGAATTAAGTGTA
ATTATATCTTTCTCTAAAGAAAGTTGTCATGTGACCTACCTTTACCATTTGCATTCTTATTTGCTGTATG
TTGTTTTTAATTGACTTTCTCTGAATAAATGTGATTGAACCAAAGGATTCCTGGGAATCCAGGATATTTT
CTTTCATACTTTTATTTCTGAGGGTCTGAAAGAGAACAGCAAATCCAACAGTAGGCTCCATTAATAAAGG
AGGTGAGAAAACTGAAGCTCAGAGAAATTAGTTCAAGGACAGCTACTTATCTGTGAAGTTGGGATTCAAA
AAGGGAACTGTTTGACTGAATCCATGCCTTTAACATATATAGTAGTTACTCTATCTAAGCATTATTATAT
GCTAGGTATTGTTCTGGGCTTCACAGATGGCTCAGTGATAAAGAAGCCACTTGCCAATGCAGATGTGGGT
TCAATCCCTGCGTCAGGAAGGTCCCCTGGAGAAGGAAATGGCAACACACTCCAATATTCTTGCCTGGGAA
ATCCCATGGACAGAAGGCCTGGGGGGCTACAGTTCATGGGGTCACAAAGAGTCAGACACAACTTAGCGAC
TCAACAACAACAATAGGCATTGTTCTAAGTGCATTATTCCTGTAACCCTTGCAAAAAGCCTATGAGGTGG
GAAATGTTTACTATTCCACTTTACAGGTAAGGAGACTAAAGCTTAGAAAGGCTGAGTGACCCATCTGTAT
CCCCGTAGCTACTCCTGTTAGCCCTGGGCATGAACTTATGCCTGTCTGATCCCAGAGCTCCAGGCCCCAC
TATGATATTATTGCTTACGGACGGTGGAGCACAATATCGTATCCTATGTCGCTCAAGAGAGCAGGATTAG
CATTTAGAAGAGCCTTTAAGAATTGGTCTGTGCCAACAGCAGACTTGTCTGCCTCAACTACAAATGAGCT
GGGGTGGCAAAAGTCACTGTCAAGAAGGCATAGGCAAATTTACTGCATGTAAGTAGAAGTAGGATTGGTA
ATTTCTATAGTCTTTTTCAAGAATAGAGTTTTATAGTATAAATTTTTTTAAGGCAATTTTGTTTTCAGTA
CTGAGCCATTCACTTAATGGGAAATTATTCTTTGCAAGGGAAAGAAATAGGTAGAAAAAGGCGACTTTCA
TCTCCTTGTCTTTTCTCTCCCACATGCCTACCCCACCTCCAGCTCTTACACTCAAACCCCTACGCTTGCA
GTTTATTGTAGATGACATTCTTTTCCACATCTGCAAAAGCAGTGCCCACACTTTCTTAAATAAGAATTGG
ACCAGTGTTTAGATATATTTACCACCTCTTCCCCAAACACCAATGTCTGATTATCTAAAAGCCCACGTCT
AAAACTTGTGAGCAGTAGTTGAAGCTTCCATCAATGCAAATCTTAGATAACAAACACTTCACTGCAGCGA
GTGGTTGCTTCATTTCAATGAGGTTAAGATGGTAGTCAGTGGAAGCTCTGATTGGTTTTGAGAAATGGAC
ACAAATGTTTCTTTTTTCAAATTGGGGTATAGTTGTTTTACAGTGTTGTGTTAGTTTCTGCTGTAAAAGG
AATGAGTCAGCTATATGTATACTTATAATCACCTCCTGGACTCACAAACCCCCCCAATCCTACCCATCTA
GGTCATCACAGAGCACCAAGCTGTGTTCCTCCTGCTAAACAGCAGGCTCCCACCAGCTATCTCCTTTACA
CATGGTAGTGTATGGAGAACAGTATGGAAGTTATGTAAAAAATCTAAAAATAGAACTACCATATGACCCA
GCAATCCTACTACTGGGCATATACCTAGGAAAACCATATTTCAAAAAGACACATGCACCCCAGTGTCCAC
GGAAGCAGTATTTATAATAATCAGGACATGGAAGCAACCTAAATGGCCACTGACGGTTGAATGGGTCAAG
AAAATATAGTACATATATACAATGGAATATTACTCAGCCATAAAAAGTTGTAGAGATGTGGACGGACCTA
GAGTCTGTCATACAGAGTGAAGTAAATCAGAAAGAGAAAAACGAATATTGTATATTAATGCATGTATGTG
GAATCTAGAAAAATGGTACAGATGAACCTATTTGCAGGGCAGGAATAGAGATCCAGACACAGAGAACCAA
TGTGTGGACCCAAGACAGTGGGGAAGGGGGCGAGTGTCGGATGAAATGGATACAAATTTAATATTTGCCC
AGAGAAATATTTTCAAACGTCTCAGTCTAGTTTGAGTCATCATTCTCCATCCTGGTGAATGACGCTTCAG
TAACTTGAAACCTTACTAAGTGACATCATCCTGTCTTCGAGATAGTGAGCTTGGGCTCAGAGGACAAGCT
GCTCAAGAGGATCCTGGGCACTTTCTGTTTTCTCTCTGTGTCTCTCTTGCCTTTTTCTTTGTCCTGCACT
CTCCTTCCTCCCATGGTCTACTTTTCACTTTGAAAATCACTGATTTCAAAAATGCATCTGTAGCTGTTAA
TTTTGCGTGTATCTGATTTTATGGGTTTGTGGAGCTCCTGAAGTTAAATTAATTTTAAATGGATTTTAAT
TACTAATGCTTTGGGCCATTGCCTTTAGCAAAGTCAACAAAACAATCAGGGTTACCTTAGTTTGTCTCTT
TGAGGAATGTGCTCACCCGTTTGGGCTCTTACTAAACAGAGCAGCTTTCTAACCAGAAACATAAATAAAT
CCATGGCTATATTTGAAGGTCTGAGTTTATTTATTTTTCAAAAACAGCCACCGTTGATTTAAGAAATAGG
ATTTTAAATGTGCACCAAAGGATTTTTGTTTGTCAATAAAACCTCATTATTTTTCAAAGGTGGAGAGGAT
CCTAGTCATAATAGTATGCTATGTAAATGCTAAAGCAGAGGCTAAATTATGACCCAAAGGTTTGAATTCT
CTTCTTAGCCCCACACAGAGTGCTTTGTATGTAACAGTAGAGAAATCAGGTTCCTTCTCAATACAGTATC
CCCTTGTAAATTAAGGGTAATCAACTCCTAGTTGAAAAGTGAAGGCATAATGAGATAATGTTCAAAATGG
CTTTTAAGTGTCTACCTATAGAGGTTGAAGGTGAAAAAAGAACATTGCCATGTCTGTGAATACCTGCCTC
CCAGGAGGGAAGGCTACTCAATGTTGTTCCCAGCGGAATGTTCTGTTCAATCAGGAGCCATCACTTGAGT
GGTTTGATAATAATAAGCTTAGATTGACTGCAGATAAAAATAGGAAAGGTTAATTAAGTCATTTAGTCCC
TTGACAATTTTGCAAACCAAGCGATTTCTGTTAAGGATCAGCCTTAGACTGTGTAGTACACAGTCCAGTC
GAATCACCGAGGTTCTAGTCTCCACCTTCCACATCTTCCCCCAGAAAAGAGCATAAGATTTTTACTATTT
TCTCTGAAAATTCATATCAAATTTCCTTCCTCTCTATATATGCTACTATTATTAATGACTCCTAGGTGGT
CTCTAATACTTCTTTTATCCTTCTTAGTTGAAAACGTGCTAAAATACACCCAGGTAATAGAAACACCTGA
GGTTGGCAGCTGGTTAATTCATTAATCAGTTAAATTCATTGTTTAGCTTGACAGATATAAGACTCTAAAC
AGCATGTTAAAGCTCTATGTCCAAATATATGAATATAGAAGATAATTAGAAATAAAATAGTGATTTAAAA
TTAAAACTCCAAACATAAACTTCCTGAGACAAGTACAAGAAGAGAAAAGGACCTTTTTTTCCCCTCCAGT
TTCTCTCTAGAATTGACTTTCCACCCCTATTTAATACAGCCCTCAATAATCCTGTTTAACTCACAGTGTA
TCCAAATGTGCAAAGCCAAATAAATATACAAGCAATTATTTATTAATAGCAAGAATGTGTGTGTCTATAC
TTTTACAGTTTCCAGAGTATTTTAGACTCATTCTTTAAAATAATATCTGCTTATTTTTCTCAAGCTTTAA
AAAATAATCTAGGTAATTGTTTCACCCTTATAATGAGATTCATCTTTCATTCATGTAATTCTCGTTTGGT
TCATGTGTACTCAAAAATGTAAAAAGTTGAGGATGTTTGAGAGCACCCCCAAAGTGCAGATTCCAGGGTT
TTGTATAAACACTGCCACGGGCACGCGCACACACACACACGCACACATTCTGCTTCTTTATCTTTGTCTT
GTTTGGTATTCCCGGTCAGAAATTACTTTCCTTTCAAATCCCCCTTGATAACTCAGATTCTATGTGATTG
CTTTCAATGGTTGCCCACCATTAAGAAGTCTTCTTTAAAGGCAAAGAGACTGGATAGCAAAAAAGATGAA
TCAGAGGAAGGAAAAAAAAATCTTTGAAGAAGAAAATATGCTTGGGTAAAACAGAAAGCAAATGACATCT
TAGTTATACAGGTACTTCTAAAACTTGTGCTATTATATATTATATATATAATTTAAGTTGCACGTGAATT
CTCCGTACTTTGGAATAAATTTTGGAGTAAGTTTTCTTGTTCTTATTCATCTTAAATATAGATCACTATT
TGAAAAAATAAGTGATAACTTGTTTTCTCAAGTAAACAAGAGTGAAACTGACCATAGAGGACTTCAAAAA
AGAAGCAGGGGAGAGGGGAATTTGCAAGATGAGGTGTTTGATAGAAACAAAAAAGTAAGCACTATTTAGG
CAGAGTACTTGTGATGACTAAAGCACAGGATTTGAGCAATGATGGGAACCAGCTCATCTTTAATTTCTTT
ATGCCTCATGGGTAATTTTAGGCACATAAAGGAGCCTCCAGTGAAAATGTCTTCATAATCATCAGCTCAG
TTCAGTCGCTCTGTCGTGTCTGACTCTTTGCAACCCCATGGACTGCAGCATGCCAGGCCTCCCTGTCCAT
CATCAACTTGAGTTTACTCAAACTCATGTCCATGGAGTCAGTGATGCCATCCAACCATCTCATCCTCTGT
TGTCCCCTTCTCCTCCTGCTTTCAATCTTTCCCAGCATCAGGGTCTTTTCCAGTGAGTCAGCTCTTCGCA
TCAGGTGGCCCCAGTATTGGAGTTTCAGCTTCAGCATCAGTCCTTCACAATCATGTTTTATTGTACCCCT
CCATTTTCTCATCGAGCCACACAGGCTTTTCTTTACTTCTCCAAACATTCAAGGTTCCCTCCTGTCTCAA
GACCTTTGCAAATGTTCTGCTTTGCTGAAAGAGTCTCCAGAAACAGCATGGATCTAGGACATCAGGATAA
CCCATGATACAGCTGGTTGTACAGATGTGGGTGTGACACACAGCATCCTAGCAGGGTCACTTCTGGGGGC
ACTAGAACTCACTGTCAGGGAGAGTGCAGGGCGAGGGGATGGGAATCTCAGTAACAGCCACACATCTAGA
GGAGGCATCCAACGAGAGTGAGAAATGGTCACGGAAATAGGTTCTGTCAGCTTAGGGCAATGTCTTGGGA
GCCAAAGGGAAAAACAAGTTTTAAGAAAACAGATACAAGCAGTATTGGACACAGTACTAAGCACAGGGTT
GGCAGGTAGAAAGCACTTCTTTTTTCACCCCCATTCATTCATCCATTCAACCATTCATTCTACAAATATG
CATGTATAGCCAAGTGCCAGTTGCTCATCTCATAGAAGTGAGGAAAACAAATACTATCCCTACTTCATCC
TAAAGCTTATATTCTGGGGGAAGAGGGAGTGGGGTGAAGAGAGACACAAAACAAATTCACAATAAGTATA
TAATATAATATCAAGTGGCAGTAAATGCTATGAAAAATGCAAAGCAAATTAAGGAAATGGAAAGATCCAG
TGGTGCTGCTTTGGGAAGGGGAGGCAGGGAGACCTCTCTGTTGAGGGGTGGGGTGATATCTCAGTGCAGA
CAAGAATGAAATGAGGCCTGAGATGCTAGCAGTTATGATGATTAACAAGATGACCTGAAGATCCTGGTGT
GAATGGTACCAAGGGATGAGAGAGTCCTAAGAAGGAGAAAGTTGTCAACAACTCAAATGCATTGAGATGT
CAAAGTAAAGGAGAGGAAACTGTTTATTGGATTGTTAGTTTTGACCATTAAGGAGGTTAGTGATCCTAAA
AAGAGCAGCTGCATTGACTTGTTTGAGAGACAAACTGCAGATGATTAAGGAGTAAGAAGGTAACAGGAAA
TGGATAGAGGGGAGATATTTTGAAACAGTTTAGCCATGAAGGAGAGTGGTTAAGGGAGTTGCTTAGTGGC
TAAGTTGTGTCCAACTCTTTGGCAATCCCACAGACTGTTGCCAACCAGGCTCCCCTGTCCGTGGATTTCC
CAGGCAAGAATACTAGAGTGGGTTGCCATTTCCTTCTCCAAAGGATCTTCCAGACCCAGAGATCAAACCC
ACGTCTCCTGCTTGGCAGATGGGTTCTTTACCACTGGGCAGATATAAATGATACAAAACTAAATTTTTGT
GCTTCTCTCCCTCCCCCCCGCCCCCCACCCCACCGCCTCGCCCAGACAAGGTTTTCTCAGCCTGGGCACT
GTGGACATTTGGAGCTGGATGATGCTTTGATTTGGGAGTGATAAGGGGTACTGCCCTATGCATTGTAGGG
TGTTCAACAGCATACCTAATCTCTACCTACTAGATGCCAGGAACACCCCTTCCAGCCAGTTGTGACAACC
GAAAATGTCTCAAAGTGTTGCTAAACGTCCCCTGGAAATTAAGTCTCCCCCAGTTGAGAACACATGCTCT
CTACCGAGTTAACCTGTATAGTTAGAAATGATGGATAATGAGCCCATAGAGAGGGAGAGAGAAAAGAGAT
TGAGAAGGGAGGGTCTTAACATGAGGCACCATAGTCAGGAAGTTCAAAACACCAGCAAAGTGAGAAAAAG
CAGAACTCTAAGGACTGCTAGTCAGTGCTGTGTTGTATGAGTTTAGGAACAAATGTTGTTACTCCAAATT
AAGAAGGGAGCATTGAAACTGAAAATCAGAAAAGAAAAAATACATTTGTTGTAACAATTTGGAAATCAGC
ATGCAAATTTTCAAGGAAAAGCAAGATTATTTTTAAAAAGAGAAACCATCAGTTATCTTTTTTACTTATT
AGGAGGATAAGACAAATGAACACACAGTTTGAGTTGGAAAGTAGATCACGTGGTTCATAAATTGCCCCTT
AGCCTGGGAAGTCTTTAACTTGGGATTGAAGACAGATGTTTCTGATGTGACTCATGTATTTTTACCACTT
TCTCATGGGCATGAACTGAGTTTATTCCTTTCTGTCTCCTGGTGATGTCAGTGTCTTGTTTACACAGGAT
GAGCCTGCCTAAGTCAAAACTAGCCCCAAGTCATGTGATGCATGGTAAAAATACAGAGGTGGGTTTCTCT
TTAGATAGAATATAGTTGCAGGCTCTAGGGGAATAGACTTGATTTCTGGGTTGAGTGCATGAATTTTAGA
TATTTGGGAATGTTCTAGAATAACAGAATTTCAGAGTAGAGAAAGATTGCCATGGTTATCTCCACAAACA
TTTTATAATTTCTCAAGGTGATATTTTAAATATTACACATTAAAAGCACCTGAAAGTTGTGCCTCTTTTA
GCGGTAGAAAAAATGTGAATATACTACTTGGCAATCAAGTGTTCTCCCCCCTAAATTAAAATTAAATATT
CATCTACACCTGAAAAGACCACTGGCATTTAATACCATATCAGAGAGAACTATTATAGTTATTTTATGAT
GTGATTTCCCTAAAGCCTCATGACAGTTTAGTGAGTCAGGTATTATTTCATGTGAAAACAAGAAAAACTA
GCAATTGTGCAGAAAATCATACCTAACAAGGTAGAAACAGCTGTTTTGTGTCCAGAATTGATCTTTGGAC
ATCCCTCCCTCAGTGATTTTTTTTTACCTGTTTGATGAGATCTCAACTTCTTTTTAATGGTAAATACACA
AACTAAACCACTTTGGGACTGATATTCGGGCAATTGATTTCATCCAGCCAGACTCTATCCAGGACAGCTG
AGTCAGCAGACATGATTTTTGTCCAGGTCCTGTCTAGCAGAGCACCCTTCAACCTCCCCTGGCCGACCTC
TCTAGCAACACACTCGGGAAATGATCCATCAAGGGTCATATCAGATGTTATGAAGAGGTAATTTTCTCTG
AGATTTAAATTAAGGATTTGGGTTTGGTCACTGGCCTCCATATGAGGAGAATACTTAATGATTTTTTAAT
CAAGTACAGTTTTTGTAAGCAAGCGTATTCTGTTGCAAGATTTTTGGAATTTATTACATCCTAAAGAAAA
AAAAAAACAGCCCTCAGAAAATCATTAATTTCTAACAAGTAACTTATTTCCATTACAAGGACCCTTTCTT
TGAATGGATAGTTCTGGTCTAATCTATATGCTTTCTTAACTTTCATGGTAACTGTTACTGTGCACTTGTT
ATGTATATGTGTGTGTATATATGTATGTATATATGAGTATATACACACACACACACACATCTTCTGCTCC
AAATGTTCTGTTTGTTTGCTTGTTTTCTTTTCATTGGGCTCTTAAGAGCCCCAACAACCCACAATAACTT
GCAGAATCTACTTAGCCCCAGGGAAAAAAAGAGCATCTCTGCTAAGCATCACTTCTCCCAAACATTATAA
GGGGATACACTGAAGTGACTTTAAAGAAAGAAATCTGTGGATAGAAGCTTCAGGATACTCAGTGTAAAAG
CAACCCCTGAGTCTCTAAGACAGCATCTCACCCAACAGAACCAGGATCTCTTCCCGGTGTGTGTGCAGAT
GAAATGATTGGGAAGGCACAGTTGGGTGGGCCAGTGCGGAGAAGATGGATGAGTGGCCTCTGTGTCTGTG
AGGTGGTCAGCGAGAGAGGCAAATCCCACCGCCCCCCACTCCCATAGCTGTATGTGCCACTCTTCCTCTA
TACAAAGACAATTTGCTTTTCATGTAGTAATGAAGAACAATGTTTAACAGACAAAGGAGCTTCTAGTTTA
TTATTGTGCTGTCTTTGACTAGAGCTCGATGCTGGGAAAAATGGTAATTCACAGTGTGAATGCTTCTTTT
CCAGGCTTACGGGTTATGGCTGAGCATGTACATTGTATATTGCAGATGGTACAATAGCGAACACTGTGAG
AACGAGGAAAGAATGAATTACAGCTTTCCCCCCGCCCCTTGCTAGCCAAGAAAACCAGGAATTTCTTCAC
AATGATTTTTATGTATGTTTCTGTGTGGATGGGGTATTCCTGAGCAATATACGCTACTAGATTCTGATTA
ATGATTTTAGGTTTTAGTGTTTTTAAACTAATGTACAAAATCTACCACGATGCTCTTAGAACACCTGTGT
TCTTCTGCTCCCCCCTCCCTCCGCCCCAGTAAACAGACAAGGCACAGGTGGTGCCTCATTAACTTATTTC
ATGCAAATAAAAACATATTGCATGATTTCTTCTCCTTAACTATATAATCACTGAAACATGATCGTGTTGA
TGAACGCAACAAACTTTGGAAGGGGAAGATTCAGCATCTTTTGTTGGTTCTCTTATGCTGAAGGGATTTG
CAGGCCTGGTAGTAAATCTGATTTAAACGGAAAAGTAGAGGGAATGTCTCTTTCCTCTGTCAGAAATCTA
TTTTCTTCTAGTCTTTTCTGCATGTCCTAGGACCCTCTGTAATATAAAACTGGACATATAGAATTGGAGA
AAAAAGGGAAGACCAAATTTCCCATGGAGGTAGAAGAATTGGGAACCCAGATGAAAAGCCATCACAAATA
AGCACGCTTAGGTGTCTTGATTCAGTGTTCAGATATTCTTATCAATGATTTCCAGGGAGTGACACTGATT
TGCTCTGGAAACAATGATAGAAAAATAGTGACTGTGTACTGGAATGCTTTTAGCATTCTAAAATCATATT
GGAAAATGTCCGAGCATTCCAACCCAACTGTAAACAGTACCATGCTGTATTTATGGGGGATTGCAAGTCT
GTTTTTAAAAGTCTTTTGAGATATTATTTCTGTTTTTAATTTATTGAATTAAAGCCAGATTTATTCAATT
AAGATAACACACTTCAAGGTTAACTGCCATTTGCTGTTCATTTAGGCAAGCTATTGTACAGTTATGTAAA
TTTCATGAACAGAGACACTACTTGAGTGTCTTAATTACAAGCAAAAAGTGTCATTTAAATATGAATTAAG
TGAATCGTTGCCAGTATTTCCTGTTCCTGTATTTTGTATCTTCTTAAAATCTAAGGGCTATGCTATGATT
GGTACTCTGTCCTAAGATTCTTTTTCCTATAAAAGAAAAACACTCGTACTATTAAATATTATTTTTCAAA
GTAGGAAAACATGTAAAAATGCCACGATCTCATTGATCATGAATCATAGAAATATCTGAATAAGAAGGTG
ACAATTTTTTCCAGGTGACCACTGACAGTCATCTTCACATTGTTACTGAAATTTAAAAAGAAAAAAAAAA
TTCCATCCTATACTAGGGGGATAATTTGCTACATGTGTCATTTGCTCAAATTTACCCTCAGTGTGCATTC
ATAGTTTAAAAAATACTCTCTTTTAAATCTACCCATTTATAATGGATCCAAACATTCAGTTCTCCAACTC
AAACACTGGGATGAGCACAATTTTACCACTTCTTTAACTTTGTTGAAAGGAACTTTTATTTTAAACTTAG
AACTGTCATGTCTATAGTTGGGCTTAGCAAAAGGGACCCACAATGACCTAATACTTTGTGGATTTCTGTG
AAATAATTCCCCGTGCATCTCGGACGGACACATTTTTTTGTGTGTCGCTGTATTTGTTGGATACTTCTTG
TGTATTAAAACGATCTAGAGACCAATCTGGCATTCATCAGGACTCGGAAGGGAGGATTCACAGTCCTGTT
CCTGAAAACCCCACCTCACTTCTGCGCAGTCGGTTCCCAGCTCCACTGCTGAAGCCTGGGAAGTTATTTC
CAGCCGCCCCCCGGCCCCCTCTCCAAGGTAGCTCCTCCTCGCTCCCCCAGGAGTGAGAAGTTCACTATTC
AACTCAGTGCTTGCATCATCCTTTCCAACATGTTCCCACTGATACATTAAAGGAATTAATTGCTCATAAT
GATCTCACCATCTGTTAACAGAGGTTCATACCTACAGGCTGGTGGCATGCGGTGGTTCTATTCAGGGATC
AATAGGCAATGCCGGGCGGACGCTTTTCCCACATTTGGCCGCCTCTGCCCGCTGAAAGGGGAGGAGAGCT
CCGTGCCGGCCATCTGCTCCCTGTCTTTCTGTGCGAGACCTTGATGCGGTCCAGCACAGCTCTGATCTGA
CTCCAGTCCGATTGGAATGTGGCTGATCTGAGAGCCTCTCAAAGCTGTGAAAGTGAGTCAAAAGGAGATG
CCCATTGTCCGGCAGGGCTGAAAGAAAATGATCACTTTAAAATTCATAACTCCCAGGTGCCTGCCATCTA
TTCATGGGAAAAACCCTCTGAACTTTCTCTTCATTAAAAGAGAGAGAGAGGGGAGTCTTGCTAAAAAGAG
AGAAAAGGAAGTAGTTTGTGGGAATGTGCAGGGGATTTTCAGGCTAAACAAATGAGGAATTTGGAATTTT
TGCACTTGTATCAAAAGGGCTGTGTAGGGAAAGGGGAAGCTCTTATTTTTAAAAAAGGGCAGGGAGTGGG
TTGGGTGGGGGGATTGGGGGGATGATTGGGGGGATTCCTCCCTGGGGCACTGTAGCAACTAATAATTTTT
TTTCTCCAGGAAAATTCAAAAATCACTCTGACCTTTTTACTTTCATTACCTGAAGCTGAGAAAAAAAAAA
TTTTTTTTTTATTCTGGACAGCATTCTTAGTTGATTTCTGTTCTATTCTGGATAACATTCTTAGTCGATT
TCTATTCTTATTCAACTGGGGTGCAGTTAGGGTGGGAGCTTAGGGAAGAATGAGAGTGGGAAATTCCTGA
ATAGGTACAAGACATCAATGGACACGCAAATGAAACTCAAATCTACCAGGTGTTTCTAGGGAGGGATTTA
TTCAGCAGGATTACCTGCGCCCACCAAGATATCTGTCAGGAGCAGAGCTCTCACTCAGCCTCAATCAAAA
GAGGAGCCATGAGGAAATAGTAAAGAAAAAAAGTTTTTAAATCTCTATGTCTAGGAAACTGAACATTCAG
CTAAGTTCCCGAACTGCAGGAACCGCAGGGCTGTTGTATTGACTGCTCTTGGCTATTGGTCATTGTAGAA
AACTGATATTGTATCAGAGACACTTAAGAATTCAGAGAGGATTTGTGGGTTAATGAATTGAGAGCCAAAC
AAATGTATAAACAGAAAGACCTAAGTTCTGTCTCTTGAAAATCTTTTATCTTACAGATACTGACAAGGCC
CCTAGAGGTTAGGTGGCACCAGTGGAAAAGAACTTGCCTGCCAATGAAGAGACATGAGAGATGCAGGTTT
GATCCCTGGGTCAGGAAGATCCCCTGGAGAAGGGCATGGCAACCCATTCCAGTATTCTTGTCTGAAGAAA
TCCATGGACAGAGGAGCATCAGGACTATAGTCCACAGGATCGAAAAGAGTCAGACATGACTGAGCAACTT
AGCACAGAGGTTAAGTGATAGGCTCAAAGTCATAAAATTTGGGCTATAACCTTACCTGTGCAATTTTCTT
TCTACTGCTTCATGCTACCTCTCTCAATGGGGAAAGTTTTAAATTATTTTTGCTCAACTTTCCCTAATTA
AAATATACATTAGTATTACCAAACTAAATAATATAATAGTTTTTTTTTAATAAATTCATGAGGTCATAGG
AAGATGATGAATACCAGTACTAATACATAAAGAATGGAATATTAATCAAGCAAGGCCTTAATATCATATG
GAAAATATAGCAAAGATGGCAGAACCTACTGGGGGAACAGGGTAGTAATGTCCTGTTGGACATCAATGAC
CAAATAGAGGAAGAAGGAGGCAATCGTCCATTCAGAGTGAAAAACCACCTCCTTAACAAGCCAAGAGTCA
AAAGAAGTGAATGGCAGAAAGGAGATGGAAGAACCTAGAAAAAATTCAAATTCTGAAAATGAGAGAAGAT
ATTTATATCAGCAATAATTGCATCTCTGATAAACCTCTTTGGCTATGATTTTGCCACTGTGCAAAGCTAG
GGAAAAGATATTTCTGACAGAATCTGTGTTGCTTTCCTAACAAAGAAGAGGCCACACCTTGACCATATTT
TTTTTTCCTAGAAATTTAGATTCAGTTTATTAAGAATCTTTGTTTTGCTTGACTTTTACTTTGTGGAAAC
GTATCTTTAATTTTTAGGGCAGTGGATGGGAACATAAAATACACTGAATAGAATTTTGACTTTTTTTTTT
TAGTATTTTGACATTGTTTTTATAGTAGGTTGAAGTGAGAAAAAGCCATCAAGGTCAGTTGTAGATTTGA
TCACTAATCCAATTTAATTCTTTTTTTTTTTTTCTTATTAAATTTAATTCTTAAGTTATTAACTTGTCTA
CTCTACAGTTTATGTGGATAAAGTCAGAATTAGTGAGATATGGTTTGGGAAATAAAACTCATGAGATTAC
TCCAGAAAAAGTAAAAAAAAAAAAAAAAAAAAAGGTAAACCAATAGAAGAAATGAATTAAAAAGTTACCC
ACAATCCCCTTACCAGGGAAAACTACTGATAATTTTCAGAGTGTATATTAAACAACAATAAAGTGAGATC
ATATTATTCACATTATCTTACATAGTATTTAAGTTGAAATTATAGACTTTTGACCATATTTGACCTGCTG
CTGCTGCTAAGTCGCTTCAGTCGTGTCCAACTCTGTGCAACCCCATAGACGGTGGCCC
>gi|546669925|gb|AWWX01450616.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig450615, whole genome shotgun sequence
AGGTGGGAGGAGGGTTCAGGATGGAGAACACATGTACACCCGTGGCGGATGCATGTTGATATATGGCAAA
ACCAATACAATATTGTAAAGTAAAAATATATATATATATATTAAAAAAAATAAAATGTTAAAATGAAAAA
AAAAAAAAAAATCTCACCCAGAGAGGCACCAGGATTGGAGTCCAGAGAAAAAAGAAGAGAAAAAAAATCA
CTTGGGGACATAGCAAGAAGGTGGCCATCCTCAAACCAAGGAGAGAAGCCAGAAGAAACCAAACTTTCCA
ACACCTTGATCTTGGACTTCTAGCCTCCAGAACTGTGAGAAAATAAATTTCTGTAGAGTCACCCAGTCTG
TGGTATTTTGTTATGGCAGACCTAGCAGACTGATATGCTCCTTAAGGCAAGATGTTTGCTCCTCTGAAAT
CCAGCAGCATTCTGAGCATGTGCCGTTTTAGCACTTATCACAGCTATTAATATTTTACAGTCATCAGTTT
ACTTGTGCATCTCCCTAACTGGTTTGCAAACTTTTGATGGAATGATTCTATATATTCATCTGTTTCATTA
ACCTCTTGCAATTTGTTTGGAAATTTGTAAGAAATAATATAAAGGCCTTTAGAATGAGTTAATTTCTTGT
TCACAATGTGCAGCTTGCTGATAAATAAGCAATGAGGTTTTCCTTGTTGTCTAGCCTCCTGACATTTCTG
CCCAGGGTGCTAAACAGGAAGTATTGATTTTGATATTGGATTTCATTATTAGAGGCTTTGCCTTGAATTT
TTTTATATTGTTGTGTGAGTACTACTTAAAGATCAATTTTCTGTACCTTTACATATTTTCTAAAGTGCAG
AAGATGAAGATGGGTTTTACAACAAAGATAAGATCTTTCAGTAAGTCAGTGAGGTTGTTGTAGGTGGATT
GTTTTTTGTTTTTATCCACATTATAATAGCCTCCTTTTCCCAAGAAATACCCCTTTATGCAGGAGTAGGT
TTAACATAGTCTACATGCCATGTTTGGTTGGACTAAAGTTGTTACTAGATCCACATTTGACTAGTAAGAT
GCTATCCTCTCAATTTTGAATTAAGAGTGAGAGAAGTTAATCTTTGTCCCTTTTGAGTGACTCAAACCAA
GGAGAGTATAAACTCCAGAATTGTGAAACAGTCTTGTTCCTGGAGAAACAGAGACATCTGACTAAAGAGC
AGGATATATCACAAATACAAAAGATTTGCAGAGAAAAAGATTTTGTGTAGCTACAAAGATAGAGTGTTAG
AGAGAGAAAGGCATCAGTAATCCCAATTAGTTTTCATTTCCTCATTTCAGTTCCTTGTAAGGTCCTCTTG
CTTTCCTTGTTCTTGGATTCCATGATTTTCTACTGCTGGTACTGAGTGTAGAAGTCAAATGTTGCCTAGT
ATGTGATTTTGACGGGCATCCTGGGCAGCTCAGTGGTAAAGAATCCACTTGCCAATGCAGGAGACTCAGG
TTCGATCCCTGGGTCAGGAAGATCCCCTGGAGAAGGAAATGGCAATCCATTCCAGTATTATTGCCATGGA
CAGAGAACCCTGGCAGGCTACAGTCCATGGGTTGCAAAAGAGTTGGGCATGACTTAGGACTAAAAGCAGT
GTCATTTGGGGGAATATTTTTTTTTTTTTTTTACAGCCTTACCTGTTCCCTAGAACATTCTGCCAAATCA
GGGCTTTATCACCTGTGGATTTTATACTATTCCTTACCACCATGGCCAATCAGTTATAATGTCTGTCCTC
CAAATGCCATGCCCTAAGTAGTTTTCCTCCCCCGTCTCACTACCACTGTCTTAATGAGAACTCTCCCGTC
TGAACCAGTGTGATAGTCTCTTGATGGATTACCATGACTCTAGACTCTTTCTTCTCCAACTCATTTAACA
CAGCATAATGAGATACGTCTATCTAACTATGGTTTCTGGCTGTGATTGTCATACAAATTATGGATAACAT
ATGTAAGGGAACTAACACTGTGCATGGCATATCATTGAATATTAAATTCTCCACTACTGTGAAGATAGCA
CAACAATATTAACAGCGACTTTTTTTTTTTTAGTACTGACTATGTATTTGGTGCTTCACTTTATCTAATT
TACCTTTTAATAACTCTTTGAAGAACTATTACAAGTGAAGAACCCATCTGGGTTTGATTGCTGGGTTGGG
ACAATCCCCTGGAAGGCATGGCAACCCACTCCAGTATTCTTGCCTGGAGAATCCCATGGATAGAGGAGTC
TGGCAGGCTACAGTCCATGGCGTTGCAAAGAGTCAGACATGACTGAGTGACTAAGCACACAGCACAGTCA
TGTAATTACCTAGTTGCTGAGATTATATTCAAACCCAGGATTCCTAACTTCCAAGTATTTGCTTAACTTC
CATGCTATTATCTATTGCATCTCTGTATAATTCTTGTTTTCTTAGCTTGAAAACAAACTAAACATGGACC
CTTGTAGCTTGGAGAAACTAATAGAAGTGGGACACAAAATCTGATCCAAGAAAAAGCACCCCAACAGCCT
GCTGCAGATCCTGTACTTAATGTCCTTACAGCCTTGCAGTTTTGTGAAAAGCAACACTTCAGAGATTTTG
CTCTCCCTGAGAACTGTGAAACGTGGCCTTTGCTAGAGAATTGATGTTGTTTAGCCACTAAGTTGTGTCT
GACTCTTTTGCCACCCCATGGAATGTAGCCTGCCAGACTACTCTGTCCATGGGATTCTCTAGGCAAGAAT
ACTGGAGTGAGTTGCCATTTCCTTTTCCAGGGGATATTCCTAGCCTACGTCTCCTGCATCTCTTGCATTG
GCAGGCAGATTCTTTACTACTCAGCCACCTGGGAAGCAGAACGGCTGCAGTCAAATACAGACTGTGTCAC
TGCCTACCAAGTATGTGTCTATGCATATTAAATCCAGACAAAAGGATTTCAACAGTTGAAATTGGAGTGC
TGTCCAAAAACTTGGGGGTCACAGAACAATAGATCTTATACAATCTAATCTGATTTGACATAGGTTCAAA
TGTTTTTATATCAAAGTTTACATTATCATGCAGTGGTAAGACTGTATAGATCTGGCTATGTATTGCTATT
ATTCGCTTCATGTTAAATTAAAAAGAAAATATTCACAAAGAAGTAATCCCTTCTTCACAGAAAAAGCTAT
ATTAATCTTTTAAAATATATGATTTATAAAAGTCCATGAAAAACACAATTAATGGCCCTTTGAAAATCTT
ACTGTGTGGTGTGAAATGCACTTTTCCTATCATGGAGAAGGGATTACTGTTGTCCATATTTGCCTTGAAA
CCTCTATCCACAGTCCACTGTTGTTGATACCTCAGAAGGTATAAACTACTAAATATTATTGAATTTAGAT
GGAATGGATTCAGTAATGCAAAAATAAAGATTTCAAGTATACAGCCAACACTGGAAAGGGTCTTAGGTAA
AGGAGATCCTCAAATGACTTTCAGATATGTCATGATTTCTGTGGAGACAGTAGCCAATTATGGGGTGAGA
GAGAACTGAGAAGTAATCTTAGGATTGTACTGGGGTCTGCTTTTCCTAGAACCTTCATGGGTAGAGTTGC
TGAGGACCCACATTTGAATAATCAGACACTGCTATCTGACTGTTCCCATCAGAACCTATCCCCTTTCTCT
ATATATTTTTTCTTTACCTTGGATTTTTAAAAATTTATTTTAATTGGAGGCTAATTACTTTACAATATGG
TGGTGGTTTTTGCCATACATTGACATGATTCAGCCATGGGTGTCCATGTGTCCCCCATCCCAAGCCCCCC
TCCCACCTCTCTCCCCATCCCATCCTTCTGGGTTGTGCCAGTGCACTAGCTTTGAGTGCCCTGTTTCAAG
TGTCGAACTTGGACTGGCCATCTATTTCACATATGGTAACATACATGTTTCAATGCTATTCTCTCAAACT
ATCCCACCCTTGCCTTCTCCCACAGAGTCCAAAAGTCTGTTCTTTATATCTGTGTCTCTCTTGCTCTCTT
GCATATAGGGTCATTATTACCATCTTTCTAAATTCCATATATATGCATTAATATACAATATTGGTGTTTT
TCTTTCAGACTTACTTCACTCTGTATAATAGGCTCCAGTTTCATCCACCTCATTAGAACTGACTCAAACT
GGAGCCTATTATACAGAGTGAAGTAAGTCAGAAAGAAAAACACCAATATAGTATATTAATACATATATAT
GAAATTTGAGACATTACTTTGCCAACAAAGTTTCGTCTAGTCAAGGCTATGGATTGTTCCTGTGGTCATG
TATGGATGTGAGAGTTGGACTGTGAAGAAGGCTGAG


file2 is result.ods

Code:
subject id	 s. start	 s. end
gi|546669925|gb|AWWX01450616.1|	282	305
gi|546671471|gb|AWWX01449637.1|	771	790
gi|546669842|gb|AWWX01450698.1|	1523	1542
gi|546669842|gb|AWWX01450698.1|	1641	1660
gi|546671514|gb|AWWX01449617.1|	1926	1948
gi|546669842|gb|AWWX01450698.1|	2484	2503
gi|546669842|gb|AWWX01450698.1|	2720	2739
gi|546669842|gb|AWWX01450698.1|	2725	2744
gi|546669977|gb|AWWX01450566.1|	2822	2842

output:

I want to extract a region, for example positions 282-305, from sequence gi|546669925|gb|AWWX01450616.1| in file1 (smalldata.fasta).
The output should be a short 23-character string (305-282=23).

I also want to extract the same region extended by 100 characters upstream of 282 and 100 characters downstream of 305,
i.e. the result should be a 100+23+100 = 223-character string.

The results should go to a separate file, not into either of the two input files.

I would be grateful for a script that works with these two files: file1=smalldata.fasta
and file2=result.ods.

Thank you.

Last edited by Scrutinizer; 08-06-2015 at 04:14 AM.. Reason: CODE tags
# 2  
Old 08-06-2015
Please also post an output file sample.
# 3  
Old 08-06-2015
It should be an Excel file, laid out like this:

Code:
column1      column2                         column3
seq id          23_character_seq           223_character_sequence

Moderator's Comments:
Please use CODE tags

Last edited by Scrutinizer; 08-06-2015 at 04:30 AM.. Reason: CODE tags
# 4  
Old 08-06-2015
I mean an output sample, not an output format. And please use code tags.
# 5  
Old 08-06-2015
And make sure that the output you show us includes the (exact) output you want produced for at least the following file2 input lines:
Code:
subject id	 s. start	 s. end
gi|546669925|gb|AWWX01450616.1|	282	305
gi|546669842|gb|AWWX01450698.1|	1523	1542
gi|546669842|gb|AWWX01450698.1|	1641	1660
gi|546669842|gb|AWWX01450698.1|	2484	2503
gi|546669842|gb|AWWX01450698.1|	2720	2739
gi|546669842|gb|AWWX01450698.1|	2725	2744
gi|546669977|gb|AWWX01450566.1|	2822	2842

Note that the 1st line in this file2 is related to one entry from file1, the next 5 lines from this file2 are related to another entry from file1, and the last line from file2 is related to an entry that is not found in file1.

Is the output for the 5 lines related to the string gi|546669842|gb|AWWX01450698.1| supposed to generate 5 sets of output OR is the output for those 5 lines supposed to be combined into 1 set of output duplicating some of the output (due to overlapping ranges) OR is the output for those 5 lines supposed to be combined into 1 set of output containing the non-overlapping regions of the requested ranges 1423 through 1760, 2384 through 2603, and 2620 through 2844 (where the start and stop points have been extended 100 characters in each direction and the five overlapping input regions in file2 have been combined into three non-overlapping output regions)?
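
For the third option, the combined ranges can be worked out mechanically. The following is only a rough sketch, assuming the start/end pairs for one sequence id are already sorted by start position and that the flank is 100; the function name merge_flanked is made up for illustration. It does not know the sequence length, so it cannot clip the downstream end.

```shell
# Extend each start/end pair by a flank on both sides, then merge overlaps.
# Assumes input lines "start end" for ONE sequence id, sorted by start.
merge_flanked() {
    awk -v flank="${1:-100}" '
    {
        s = $1 - flank; if (s < 1) s = 1    # never reach before position 1
        e = $2 + flank
        if (NR > 1 && s <= cur_e) {          # overlaps the range being built
            if (e > cur_e) cur_e = e
        } else {
            if (NR > 1) print cur_s, cur_e   # flush the finished range
            cur_s = s; cur_e = e
        }
    }
    END { if (NR > 0) print cur_s, cur_e }'
}

printf '%s\n' '1523 1542' '1641 1660' '2484 2503' '2720 2739' '2725 2744' |
    merge_flanked 100
# prints:
# 1423 1760
# 2384 2603
# 2620 2844
```

Fed with the five ranges listed for gi|546669842|gb|AWWX01450698.1|, this gives the three non-overlapping regions quoted above.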

And, for the last entry in file2, there is no entry in your sample file1. Is anything supposed to appear in the output for this case? If so, what?

And, just for the record, the number of characters specified by the range 282 through 305 is 24 characters; not 23. (If you don't see why that is true, take the simpler example where the range 282 through 282 is 1 character; not 0.)
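
The count is easy to check in the shell. A minimal sketch (the helper name range_len and the toy string are made up for illustration): an inclusive range start through end holds end-start+1 characters, and that is also the length to pass to awk's substr().

```shell
# Inclusive range length: end - start + 1.
range_len() {
    echo $(( $2 - $1 + 1 ))
}

range_len 282 305    # prints 24
range_len 282 282    # prints 1

# The same arithmetic drives awk substr(string, start, length):
# positions 3 through 5 of a toy string are 3 characters.
echo "ABCDEFGHIJ" | awk '{ print substr($0, 3, 5 - 3 + 1) }'    # prints CDE
```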

Last edited by Don Cragun; 08-06-2015 at 06:24 AM.. Reason: Fix typos in counts.
# 6  
Old 08-07-2015
Sir, that is an Excel file; how can I post it here?
It is roughly like this:

Code:
sequence id	extracted region small	extracted region big upstream and downstream
gi|546669925|gb|AWWX01450616.1|	CACCTTGATCTTGGACTTCTAGC	"CCAGAGAAAAAAGAAGAGAAAAAAAATCACTTGGGGACATAGCAAGAAGGTGGCCATCCTCAAACCAAGG
AGAGAAGCCAGAAGAAACCAAACTTTCCAACACCTTGATCTTGGACTTCTAGCCTCCAGAACTGTGAGAA
AATAAATTTCTGTAGAGTCACCCAGTCTGTGGTATTTTGTTATGGCAGACCTAGCAGACTGATATGCTCC
TTAAGGCAAGA"

---------- Post updated at 03:30 AM ---------- Previous update was at 03:01 AM ----------

Cragun sir, yes, I want to generate 5 sets of output, one for each gi|546669842|gb|AWWX01450698.1| line. The id occurs multiple times, but the positions are different, so the result file should contain 5 lines for that id.

And the last entry is present in file1; see entry no. 4.

---------- Post updated at 04:27 AM ---------- Previous update was at 03:30 AM ----------

Is there still any problem, sir?

---------- Post updated 08-07-15 at 12:23 AM ---------- Previous update was 08-06-15 at 04:27 AM ----------

Hello, I am waiting for your answer, sir.
# 7  
Old 08-07-2015
Try this, based on the good work of Scrutinizer in this post
Code:
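awk '
# Print the header line of the result table.
BEGIN           {print "sequence id\textracted region small\textracted region big upstream and downstream"
                }
# First file (file2): skip its first line and store every start/end pair per ID.
NR==FNR &&
FNR>1           {CNT[$1]++                      # number of ranges seen for this ID
                 S[$1,CNT[$1]]=$2               # start position
                 E[$1,CNT[$1]]=$3               # end position
                 next
                }
# Second file (file1): $1 is the FASTA header line; T[1] is the ID before the first space.
                {split ($1, T, " ")
                }
T[1] in CNT     {i=T[1]
                 $1=x                           # drop the header; with OFS empty, $0 becomes the joined sequence
                 for (j=1; j<=CNT[T[1]]; j++)   # one output line per stored range
                        print RS i "\t" substr ($0,S[i,j],E[i,j]-S[i,j]+1) "\t" substr ($0, S[i,j]-100, E[i,j]-S[i,j]+201)
                }
' file2 RS=\> FS='\n' OFS= file1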
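
One thing to watch in the script above: when a start position is smaller than 101, `S[i,j]-100` falls below 1, and awk implementations may not all treat such a `substr()` start the same way. Here is a minimal sketch of an explicit clamp that avoids relying on that behaviour. The helper name `flank` and its argument order are assumptions made up for this illustration; they are not part of the original script:

```shell
# flank SEQ START END WIDTH
# Print SEQ[START-WIDTH .. END+WIDTH] (1-based, inclusive), clamped so the
# window never starts before position 1 or runs past the end of SEQ.
flank() {
    awk -v seq="$1" -v s="$2" -v e="$3" -v w="$4" 'BEGIN {
        from = s - w
        if (from < 1) from = 1                  # not enough upstream sequence
        to = e + w
        if (to > length(seq)) to = length(seq)  # not enough downstream sequence
        print substr(seq, from, to - from + 1)  # inclusive length
    }'
}

flank ABCDEFGHIJ 3 4 2   # prints ABCDEF
flank ABCDEFGHIJ 2 3 5   # prints ABCDEFGH (upstream side clamped)
flank ABCDEFGHIJ 8 9 5   # prints CDEFGHIJ (downstream side clamped)
```

The same two `if` tests could be placed inside the `for` loop of the script, computing `from` and `to` from `S[i,j]`, `E[i,j]` and 100 before the second `substr()` call.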
