Find EXACT word in files, just the word: no prefix, no suffix, no 'similar', just the word


 
# 1  
Old 03-27-2012

I have a file that has the words I want to find in other files (but let's say I just want to find my words in a single file). Those words are IDs, so if my word is ZZZ4, outputs like aaZZZ4, ZZZ4bb, aaZZZ4bb, ZZ4, ZZZ, ZyZ4, ZZZ4.8 (or anything like that) WON'T BE USEFUL.

I need the whole word (and the line where it is), but the exact word, nothing more and nothing less. The only thing that could be useful would be (ZZZ).

Of course, I need THE ENTIRE output line, so using grep -o would not be useful either.

I've tried so many things. Nothing has worked.

-----

OK, let's say that the file that has the IDs is called 'Id'. I've tried:


Code:
grep -wf Id file >> outputFile


If my first Id was 'Sun', the previous line gives me outputs like 'Sunny' (it also gives me 'Sun' as an output, that's ok). NOT GOOD!

I've tried several grep options. I've tried fgrep. Not good. I've tried agrep, and it's not good because it is approximate, so if I were looking for 'Sunny', agrep could throw 'Suny' as an output.

The things I've not tried much are things that involve regular expressions (too scared). Anyway, if I was trying to use regular expressions, how could I do that if the words I am searching are in a file called Id?

Please help!
# 2  
Old 03-28-2012
I am totally brand new to this so don't get mad if I'm wrong... But when I'm looking through the maillog for my name in a line, I do this:
# more maillog | grep "freddy"
which is what I think you're doing now. Have you tried putting the white space in the search, like " freddy "? I know that if you want to match whitespace on either side you can use an escaped s, "\s", but I have no example.
This User Gave Thanks to Freddythunder For This Post:
# 3  
Old 03-28-2012
try this

Code:
more filename| grep -i Id

This User Gave Thanks to hedkandi For This Post:
# 4  
Old 03-28-2012
Can you confirm the content of your control file, please? I suspect that you have something like the following (using ^ to mark start of line and $ to mark the end):-
Code:
^Word1$
^Fred$
^Sun$
^Choc$

... so what you are matching is the string listed on each line (e.g. Sun), so this will match anywhere it finds Sun, e.g. Sunny, Sunday, WhitSun etc. What you might need is a tweak in your control file. What are the delimiters in the file you are searching? If they are spaces, then you need to generate your control file with spaces at the beginning and end of each line, so (using same markers):-
Code:
^ Word1 $
^ Fred $
^ Sun $
^ Choc $

That way, it will match the whole string including the spaces before and after. Of course you then have the problem that some records you want may only have a space before or after, so you might have to get a little more inventive.
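One possible way to "get more inventive", as a rough sketch only: instead of literal spaces, wrap each ID in a pattern that accepts either whitespace or the start/end of the line around it. This assumes the searched file is whitespace-delimited and that the IDs contain no regex metacharacters. The file names (ids.txt, data.txt, patterns.txt) and the sample lines are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample files; in the thread the control file is called 'Id'.
workdir=$(mktemp -d)
printf '%s\n' Sun Fred > "$workdir/ids.txt"
printf '%s\n' 'Sun rises' 'a Sunny day' 'WhitSun' 'hello Fred' 'Sun.1 x' > "$workdir/data.txt"

# Wrap each ID: (start of line or whitespace) ID (whitespace or end of line).
sed 's/.*/(^|[[:space:]])&([[:space:]]|$)/' "$workdir/ids.txt" > "$workdir/patterns.txt"

# -E: extended regular expressions, -f: read one pattern per line from a file.
grep -E -f "$workdir/patterns.txt" "$workdir/data.txt" > "$workdir/out.txt"
cat "$workdir/out.txt"
```

With these sample lines only "Sun rises" and "hello Fred" are printed; "Sunny", "WhitSun" and "Sun.1" are rejected because the character next to the ID is neither whitespace nor a line boundary.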

Does this help?


If you already have a large control file to search with, then using vi you can enter the following commands to insert a space at the beginning and end of each line:-
Code:
:%s/^/ /
:%s/$/ /


I hope that this (and the explanation) helps, but feel free to write back with more examples where it doesn't quite work.



Robin
Liverpool/Blackburn, UK
This User Gave Thanks to rbatte1 For This Post:
# 5  
Old 03-28-2012
What happens when you use
Code:
grep -w Sun file

It should work:
Code:
$ echo "Sun
Sunny
It is Sunday, let's
" | grep -w Sun
Sun

So perhaps there is an extra character between Sun and ny in the file you are searching in? Have you checked with od -c?
Otherwise what is your OS and version?
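As an illustration of the od -c check: hidden characters are one possible cause of odd matches, and a carriage return left by a Windows editor is a common example. Nothing in this thread confirms that was the case here, so treat the following as a hypothetical scenario with made-up file contents.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Simulate a control file saved with Windows (CRLF) line endings.
printf 'Sun\r\nFred\r\n' > "$workdir/Id"

# od -c prints every byte; a carriage return shows up as \r before each \n.
od -c "$workdir/Id"

# Write a cleaned copy with the carriage returns stripped.
tr -d '\r' < "$workdir/Id" > "$workdir/Id.clean"

# Search a small sample file with the cleaned control file.
printf 'Sun\nSunny\n' > "$workdir/file"
grep -w -f "$workdir/Id.clean" "$workdir/file" > "$workdir/out.txt"
cat "$workdir/out.txt"
```

With the cleaned control file, only the line "Sun" is printed. The same od -c check can be run on the file being searched.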
This User Gave Thanks to Scrutinizer For This Post:
# 6  
Old 03-30-2012
Thank you !!!

First of all, thank you all for your answers.

My OS is ubuntu 11.10.

I have a friend who is trying the same thing. On his computer the same command line works fine.

My control file does not have the ^ and $. Maybe I should add those to each line. My control file looks like this (but much longer):

NGO2105
NGO1081
NGO2156
NGO2158


And I'm trying to look for those identification numbers in files that look like this (but longer):

B0C5G9_ACAM1 B0C5G9 158335687 AMAR329726:AM1_2537-MONOMER null null null 5681349 CP000828_GR null HBG284974 IPR009915 amr:AM1_2537 null HPLMLGF null null null null null PF07298 null CLSK872158 null null YP_001516859.1 null null null null CP000828:ABW27545.1 AM1_2537


So, sometimes I find things like NGO1081.1 and that is an incorrect value for me.

I have other types of files in which I'm searching for the identification numbers, but I think if it works with one it works with the others.

Anyway, I found the problem doing some 'hello world'-like tests, where my control file was like:

NGO2105
NGO1081


And the file I was searching in was like:

dghsahgjgdsNGO2105
NGO2105ghdshf
NGO2105
asghdfNGO2105dsfjh
(NGO2105)
NGO1081
NGO21


Then the output file was like:

dghsahgjgdsNGO2105
NGO2105ghdshf
NGO2105
asghdfNGO2105dsfjh
(NGO2105)


I was doing other tests right now, following several pieces of your advice, and it turned out that grep is doing it fine. I took some precautions like deleting hidden files (that were making some noise). The last command line that I used, and that worked fine, was:

Code:
fgrep -wf Id -h searchFiles/* >> outputFile

I don't know if the -f was necessary but it works fine with it.
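For reference, a small reproduction of that command on the test data from this post, written as grep -F (the current spelling of fgrep). The -f Id part is what makes grep read the patterns from the control file, so it is needed; -h only suppresses the file name prefix when several files are searched. The file name searched.txt and the extra NGO1081.1 line are additions for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' NGO2105 NGO1081 > "$workdir/Id"
printf '%s\n' dghsahgjgdsNGO2105 NGO2105ghdshf NGO2105 asghdfNGO2105dsfjh \
    '(NGO2105)' NGO1081 NGO21 NGO1081.1 > "$workdir/searched.txt"

# -F: fixed strings (as fgrep), -w: whole words only, -f: patterns from a file.
grep -F -w -f "$workdir/Id" "$workdir/searched.txt" > "$workdir/out.txt"
cat "$workdir/out.txt"
```

This prints NGO2105, (NGO2105), NGO1081 and also NGO1081.1. The last one is a caveat: -w treats only letters, digits and underscores as word characters, so the "." counts as a word boundary and NGO1081.1 still matches. If such lines must be excluded, matching on whitespace-delimited fields is one alternative.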

Thank you for all your replies.
# 7  
Old 03-30-2012
Quote:
Originally Posted by chicchan
My control file does not have the ^ and $. Maybe I should add those to each line.
You don't need them in the file itself.

^ and $ have special meanings to grep -- ^ means "Beginning of the line", and "$" means "end of the line."

So "^word$" would search for a line that starts, and ends, with word. It wouldn't match "wword" or "wordd".
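A quick way to see the anchors in action, as a minimal sketch with a made-up sample file (sample.txt):

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' word wword wordd 'a word here' > "$workdir/sample.txt"

# ^ anchors the match to the start of the line, $ to the end,
# so only a line consisting of exactly "word" matches.
grep '^word$' "$workdir/sample.txt" > "$workdir/out.txt"
cat "$workdir/out.txt"
```

Only the line "word" is printed; "a word here" is rejected because the line contains more than the word itself. grep -x word does the same whole-line match without writing the anchors by hand.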

So this doesn't sound relevant to what you want, since you're not looking for a regex at the beginning or end of a line. You're looking for an exact string in a column.

awk is very good at dealing with columns... How about:

Code:
awk '
# Save the list of strings to look for into the array A, so A["stringtofind1"]=1 etc.
NR==FNR{ A[$1]++; next};

{      # Check each column for the exact string.  If found, print the line and break the loop.
        for(N=1; N<=NF; N++) if(A[$N]) { print ; break }
}' controlfile logfile
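
To try the idea on a small scale, here is a sketch that runs essentially the same awk program on made-up sample data (the contents of controlfile and logfile below are invented). It uses the test ($N in A) instead of A[$N], a small variation that avoids creating empty array entries for every field checked.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' NGO2105 NGO1081 > "$workdir/controlfile"
printf '%s\n' 'a NGO2105 b' 'a NGO1081.1 b' 'x (NGO2105) y' 'NGO1081' > "$workdir/logfile"

awk '
# First file: remember every ID as a key of the array A.
NR==FNR { A[$1]++; next }
# Other files: print the line if any whitespace-separated field is exactly an ID.
{ for (N = 1; N <= NF; N++) if ($N in A) { print; break } }
' "$workdir/controlfile" "$workdir/logfile" > "$workdir/out.txt"
cat "$workdir/out.txt"
```

This prints "a NGO2105 b" and "NGO1081". NGO1081.1 is rejected, which is what was asked for, but note that (NGO2105) is rejected too, because the whole field including the parentheses has to equal the ID.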


Last edited by Corona688; 03-30-2012 at 05:34 PM..
 