Hi Everyone,
I have an input file in the following format:
score.file1.txt
contig00045 length=566 numreads=19 1047 0.0
contig00055 length=524 numreads=7 793 0.0
contig00052 length=535 numreads=10 607 e-176
contig00072 length=472 numreads=46 571 e-165
contig00019 length=667 numreads=5 474 e-136
I have a second file:
data.file1.txt
>contig00045 length=566 numreads=19
GGGCTGACGTGCCGCTAATACGACTCACTATAGGGAGAGCATAAAACACG
CCTCCTGAGCTGCAGCAGAAAAAGAGACTCCCCTTGAGCTTTCAGATTGA
>contig00055 length=524 numreads=7
GGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGGAGGGAGGAT
GCTGGAC
>contig00052 length=535 numreads=10
GGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGGGATGTCCAC
AGGCAGAGGGATGTCCACAGGCAGAGGGATGCCACAGGCA
>contig00072 length=472 numreads=46
TTTAGCTGCTTTCCCCCGGAGGAGATTTGAATTCCGGTGAAATCCAGGCT
TTGTTCATTTTAATAAGCGTCAGCCTGTCAGCGCTGTCAGTTGACAGGCG
>contig00019 length=667 numreads=5
TATAGGGAGAGTGGCATTCTAATAACAGGGGACGGGGGCAGAGGACTCTC
GCTGACCGTCCCATGTAAGGGTGGTGTCGGAT
In this file, each record consists of a header line (e.g. >contig00045 length=566 numreads=19) followed by a few lines of sequence.
In the first file (score.file1.txt), the fourth column of each row is score1 (1047, 793, 607, 571, etc.) and the fifth column is score2 (0.0, 0.0, e-176, e-165, etc.).
I would like to extract TWO records from data.file1.txt: the two with the highest score1 among the rows whose score2 is NOT 0.0.
For example, based on the data above, my desired output is:
output.file1.txt
>contig00052 length=535 numreads=10
GGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGGGATGTCCAC
AGGCAGAGGGATGTCCACAGGCAGAGGGATGCCACAGGCA
>contig00072 length=472 numreads=46
TTTAGCTGCTTTCCCCCGGAGGAGATTTGAATTCCGGTGAAATCCAGGCT
TTGTTCATTTTAATAAGCGTCAGCCTGTCAGCGCTGTCAGTTGACAGGCG
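To make the selection rule concrete, here is a rough Python sketch of the logic I have in mind. It is only a sketch under a few assumptions: fields are whitespace-separated, the contig ID (first field) is unique and matches the first word of the header, and a "zero" score2 is always written literally as 0.0. The function names (top_ids, extract_records, run) are made up for illustration. score2 is compared as a string because a value like e-176 is not something Python's float() will parse.

```python
def top_ids(score_lines, n=2):
    """Return the n contig IDs with the highest score1 whose score2 is not 0.0."""
    rows = []
    for line in score_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        contig, score1, score2 = fields[0], float(fields[3]), fields[4]
        # Assumption: a zero score2 is always written literally as "0.0".
        if score2 != "0.0":
            rows.append((score1, contig))
    rows.sort(key=lambda row: row[0], reverse=True)  # highest score1 first
    return [contig for _, contig in rows[:n]]


def extract_records(fasta_lines, wanted):
    """Return header and sequence lines for the wanted IDs, in the order of `wanted`."""
    records = {}
    current = None
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            current = name if name in wanted else None
            if current is not None:
                records[current] = [line]
        elif current is not None and line:
            records[current].append(line)
    return [text for cid in wanted for text in records.get(cid, [])]


def run(score_path, data_path, out_path, n=2):
    """Read both input files and write the selected records to out_path."""
    with open(score_path) as scores:
        ids = top_ids(scores, n)
    with open(data_path) as data:
        selected = extract_records(data, ids)
    with open(out_path, "w") as out:
        out.write("\n".join(selected) + "\n")
```

Calling run("score.file1.txt", "data.file1.txt", "output.file1.txt") should then write the selected records, highest score1 first.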
Thanks in advance.