Grep Problem


 
# 1  
Old 01-20-2015

Dear all,
my question is about grep.

I have two files.

The first one is a list of IDs (188 lines), like this:
Code:
comp10045_c0_seq1
comp10118_c0_seq1
comp10221_c0_seq2
comp10503_c0_seq1
comp10504_c0_seq1
comp10504_c0_seq2
comp10631_c0_seq1
comp10631_c0_seq2
comp10646_c0_seq1
comp10804_c0_seq1

The second file is a tab-separated table (329 lines), like this:
Code:
comp10045_c0_seq1    tr|Q6PBS3|Q6PBS3_DANRE    79.31    464    93    1    2024    633    2    462    0.0     776
comp10118_c0_seq1    tr|Q6P0C4|Q6P0C4_DANRE    44.19    43    23    1    1066    1194    177    218    0.29    35.4
comp10221_c0_seq2    tr|Q7T340|Q7T340_DANRE    95.05    303    15    0    175    1083    1    303    0.0     543
comp10503_c0_seq1    tr|F1Q9U6|F1Q9U6_DANRE    69.07    472    142    2    2330    915    1    468    0.0     687
comp10504_c0_seq1    tr|F8W2T4|F8W2T4_DANRE    98.01    151    3    0    787    1239    6    156    2e-99     307
comp10504_c0_seq2    tr|F8W2T4|F8W2T4_DANRE    98.01    151    3    0    1095    1547    6    156    2e-98     307
comp10631_c0_seq1    tr|B0S5C5|B0S5C5_DANRE    90.61    394    37    0    145    1326    1    394    0.0     687
comp10631_c0_seq2    tr|B0S5C5|B0S5C5_DANRE    90.61    394    37    0    145    1326    1    394    0.0     687
comp10646_c0_seq1    tr|F1QDS2|F1QDS2_DANRE    84.82    850    110    7    173    2695    1    840    0.0    1333
comp10804_c0_seq1    tr|Q7ZUP6|Q7ZUP6_DANRE    93.58    452    26    2    156    1511    1    449    0.0     851
comp10991_c1_seq1    tr|E7F8T6|E7F8T6_DANRE    94.21    484    28    0    1974    523    9    492    0.0     943

I want to extract the lines of the second file whose IDs appear in the first file, using the following command:
Code:
 grep -f file1 file2 > output

But it does not work, because I get 329 lines instead of 188.

Can anybody explain what I am doing wrong?

Regards
# 2  
Old 01-20-2015
Hello kleinstadt,

Welcome to the forums, and thank you for using code tags for the commands in your post.
The following may help you:
Code:
awk 'FNR==NR{X[$1]=$1;next} ($1 in X){print $0}' file1 file2

Output will be as follows.
Code:
comp10045_c0_seq1    tr|Q6PBS3|Q6PBS3_DANRE    79.31    464    93    1    2024    633    2    462    0.0     776
comp10118_c0_seq1    tr|Q6P0C4|Q6P0C4_DANRE    44.19    43    23    1    1066    1194    177    218    0.29    35.4
comp10221_c0_seq2    tr|Q7T340|Q7T340_DANRE    95.05    303    15    0    175    1083    1    303    0.0     543
comp10503_c0_seq1    tr|F1Q9U6|F1Q9U6_DANRE    69.07    472    142    2    2330    915    1    468    0.0     687
comp10504_c0_seq1    tr|F8W2T4|F8W2T4_DANRE    98.01    151    3    0    787    1239    6    156    2e-99     307
comp10504_c0_seq2    tr|F8W2T4|F8W2T4_DANRE    98.01    151    3    0    1095    1547    6    156    2e-98     307
comp10631_c0_seq1    tr|B0S5C5|B0S5C5_DANRE    90.61    394    37    0    145    1326    1    394    0.0     687
comp10631_c0_seq2    tr|B0S5C5|B0S5C5_DANRE    90.61    394    37    0    145    1326    1    394    0.0     687
comp10646_c0_seq1    tr|F1QDS2|F1QDS2_DANRE    84.82    850    110    7    173    2695    1    840    0.0    1333
comp10804_c0_seq1    tr|Q7ZUP6|Q7ZUP6_DANRE    93.58    452    26    2    156    1511    1    449    0.0     851

Thanks,
R. Singh
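
For readers new to awk, here is a commented sketch of the same idea as the one-liner above. The two miniature files and the array name ids are made up for illustration; only the file names file1 and file2 come from the thread. The comparison is on the first column only, so an ID that happens to appear in another column does not count as a match.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical miniature versions of the two files.
printf 'comp10045_c0_seq1\ncomp10118_c0_seq1\n' > file1
printf 'comp10045_c0_seq1\thitA\ncomp10118_c0_seq1\thitB\ncomp10991_c1_seq1\thitC\n' > file2

awk '
    # First file (FNR equals NR only while reading it): remember each ID.
    FNR == NR { ids[$1]; next }
    # Second file: print rows whose first column is a remembered ID.
    $1 in ids
' file1 file2 > output

# Count the rows that were kept (tr strips padding some wc versions add).
matched=$(wc -l < output | tr -d ' ')
echo "rows kept: $matched"
```

With this sample data, two of the three table rows are kept.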
# 3  
Old 01-20-2015
Hi Singh,
thanks for your solution and fast reply. The code you posted works.
But why does grep not work with my files? I know it should work...
# 4  
Old 01-20-2015
Hello kleinstadt,

You can check whether your files contain carriage return characters with cat -v Input_file; if they are present (cat -v shows them as ^M), you need to remove them. I tested the same command with GNU grep and it worked fine for me. You can also check your grep version with grep --version.


Thanks,
R. Singh

Last edited by RavinderSingh13; 01-20-2015 at 11:23 AM..
# 5  
Old 01-20-2015
My first guess would be DOS <CR> line terminators. How were those files produced? If I add <CR> characters, grep -f fails; without them it works. RavinderSingh13's proposal will also fail with <CR>s present (sub(/\r$/,"") could help in some awks).
Remove them with e.g. tr -d '\r' <file1 (redirecting the output to a new file) or dos2unix file1.

Last edited by RudiC; 01-20-2015 at 11:48 AM..
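
A minimal sketch of the check-and-remove step described above, assuming a POSIX shell. The sample file content and the name file1.unix are invented for the example. It counts the lines that contain a carriage return before and after running tr.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: an ID list saved with DOS (CRLF) line endings.
printf 'comp10045_c0_seq1\r\ncomp10118_c0_seq1\r\n' > file1

# Hold a literal carriage return in a variable for use as a grep pattern.
cr=$(printf '\r')

# Count the lines that contain a carriage return (0 means the file is clean).
cr_before=$(grep -c "$cr" file1 || true)

# Strip every carriage return and write a cleaned copy.
tr -d '\r' < file1 > file1.unix
cr_after=$(grep -c "$cr" file1.unix || true)

echo "lines with CR before: $cr_before, after: $cr_after"
```

The "|| true" is only there because grep exits with a non-zero status when it finds no matching line, which would stop a script running under set -e.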
# 6  
Old 01-20-2015
Quote:
Originally Posted by kleinstadt
But why does grep not work with my files? i know it should work...
As suggested, your BLAST table and/or your sequence list may contain carriage returns, which can be removed by any of the above methods or with

Code:
sed 's/\r//g' file

Your grep should also work once the carriage returns are gone. I suggest using Ravinder's script or
Code:
grep -w

to avoid erroneous matches.
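
To illustrate the kind of erroneous match meant here, a small sketch with made-up data: the table contains an ID that merely starts with a listed ID. It assumes a grep that supports -w (GNU and BSD grep do). -F is added as well so that the IDs are taken as fixed strings rather than regular expressions.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: the second table row has an ID that only begins with the listed one.
printf 'comp10045_c0_seq1\n' > file1
printf 'comp10045_c0_seq1\thitA\ncomp10045_c0_seq10\thitB\n' > file2

# Plain -f: each ID is a pattern that may match anywhere in a line.
loose=$(grep -c -f file1 file2)

# -F takes the IDs as fixed strings, -w requires whole-word matches.
strict=$(grep -c -F -w -f file1 file2)

echo "without -w: $loose, with -F -w: $strict"
```

Note that even with -w the ID may still match in any column of the table, whereas the awk script compares the first column only.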
# 7  
Old 01-21-2015
Could it be that grep has a problem if several lines in file 2 have the same ID?
Also, I could not find any carriage return characters...

Last edited by kleinstadt; 01-21-2015 at 11:42 AM..
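
One more thing that may be worth checking, offered only as a possibility since the thread does not confirm the cause: an empty line in the pattern file. grep treats an empty pattern as matching every line, so a single blank line in file1 makes grep -f return the whole of file2. The sample data and the name file1.clean below are invented for the example.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical ID list with a blank line at the end.
printf 'comp10045_c0_seq1\n\n' > file1
printf 'comp10045_c0_seq1\thitA\ncomp10991_c1_seq1\thitB\ncomp20000_c0_seq1\thitC\n' > file2

# Number of empty lines in the pattern file (0 is what you want).
empty_patterns=$(grep -c '^$' file1 || true)

# With the blank line present, every row of file2 is selected.
with_blank=$(grep -c -f file1 file2)

# Drop empty lines from the pattern list, then match again.
grep -v '^$' file1 > file1.clean
without_blank=$(grep -c -f file1.clean file2)

echo "empty patterns: $empty_patterns, matches with: $with_blank, without: $without_blank"
```

Repeated IDs in file 2 are not a problem in themselves: grep prints every line that matches, however many there are.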
 