Help!!


 
# 1  
Old 05-17-2008
Help!!

Hi, I need help.

I am stuck on a couple of things.

1)
I have a text file containing about 25k search strings that I need to search for in a compressed file. I have used this command, but somehow it doesn't seem to use all the search terms.

I have used zgrep --colour=always -nf [name of txt file] [name of compressed file]

I know the file contains those search strings (I have tested a few of them), but somehow the command didn't show anything.


2)
Using the same text file as in 1), I need to search a folder containing 50k email messages (.eml format). If an .eml file contains a matching search string, it should be moved to another folder, so that I can run a batch print later on.


Please help!!!

Cheers
# 2  
Old 05-17-2008
Is the search string a single long string over multiple lines which you want to find in exactly that order? grep -f and friends generally read a file of search expressions, one per line.

Are the email messages one per file, or is this a single file containing multiple messages? The .eml extension is not well standardized; it could be either.

If you have one message per file, grep -l searchstring *.eml will list the ones which match, but again, that's assuming the search string fits on a single line.
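
To make the two options above concrete, here is a minimal sketch of grep -f combined with grep -l. The directory, file names and message contents are all invented for the demonstration; only the -f and -l options come from the discussion above.

```shell
# Hypothetical sandbox: every file name and string below is made up.
work=$(mktemp -d)
printf 'abc\nxyz\n' > "$work/patterns.txt"      # one search string per line
printf 'From: a@abc.com\n'   > "$work/1.eml"
printf 'From: b@other.org\n' > "$work/2.eml"
printf 'From: c@xyz.net\n'   > "$work/3.eml"

# -f reads the patterns from a file; -l prints only the names of matching files
matches=$(cd "$work" && grep -l -f patterns.txt *.eml)
echo "$matches"
```

Only 1.eml and 3.eml contain one of the two strings, so only those names are listed.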

You really could take the time to think of a thread topic which would identify this thread among the others; basically, everyone who posts here wants help, some urgently.
# 3  
Old 05-17-2008
problem with grep string pattern file over multiple files

Note taken on the thread title.

It's one search string per line in the txt file (I cleaned each domain name down to just the word, e.g. abc.com to abc).

I have both types of email message file: 1) one single file containing multiple messages, and 2) 50,000 individual email messages in .eml format.

The problem with the search is that it doesn't seem to run all the search strings contained in the txt file against the target file. It seems only a few lines of search strings are used.
# 4  
Old 05-18-2008
Can you split up the search file into smaller chunks? My experience is that grep will complain if the patterns file is too large, but there are probably tools which will simply truncate the patterns if they won't fit into the pattern buffer. If all the hits are on patterns near the beginning of the file, that would confirm this (admittedly somewhat weak) hypothesis.
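
One way to check that hypothesis is to count which patterns actually produce hits. This is only a sketch on invented data, and it assumes your grep supports the -o option (GNU and BSD grep do; it is not in POSIX).

```shell
# Hypothetical sandbox: file names and contents are made up.
work=$(mktemp -d)
printf 'abc\nxyz\nfoo\n' > "$work/patterns.txt"
printf 'mail from abc.com\nreply to abc.com\nmail from foo.org\n' |
    gzip > "$work/target.gz"

# -o prints only the matched text, so counting it shows which patterns hit
used=$(gzip -dc "$work/target.gz" |
    grep -o -F -f "$work/patterns.txt" | sort | uniq -c)
echo "$used"
```

If the patterns that show up all come from the first part of the patterns file, truncation becomes more plausible.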

If you can install GNU grep, at least it will complain if the patterns buffer is too large. Try also adding the -F option if your zgrep supports that; or, uncompress the files temporarily, and use fgrep.
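
A rough sketch of the chunking idea, run on invented data. The file names, the chunk. prefix and the chunk size of 2 are assumptions for the demonstration; for a 25k-line patterns file something like 1000 lines per chunk would be a more sensible starting point.

```shell
# Hypothetical sandbox: file names and contents are made up.
work=$(mktemp -d)
printf 'abc\nxyz\nfoo\nbar\nbaz\n' > "$work/patterns.txt"
printf 'mail from abc.com\nnothing here\nmail from baz.org\n' |
    gzip > "$work/target.gz"

# Split the patterns into chunks of 2 lines each (chunk.aa, chunk.ab, ...)
split -l 2 "$work/patterns.txt" "$work/chunk."

# Run every chunk against the uncompressed stream.
# -F treats patterns as fixed strings, -n prefixes each hit with its line number.
for chunk in "$work"/chunk.*; do
    # grep exits 1 when a chunk has no hits; that is not an error here
    gzip -dc "$work/target.gz" | grep -n -F -f "$chunk" || true
done | sort -t: -k1,1n -u > "$work/hits.txt"   # merge and drop repeated lines

cat "$work/hits.txt"
```

The sort step merges the per-chunk output back into line order and drops lines that were hit by more than one chunk.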

To copy message files which match one of the patterns in the patterns file, use something like

Code:
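# List the matching files one per line and copy each; with 50k messages the backtick form can hit the "argument list too long" limit
fgrep -f patterns.txt -l -r /path/to/messages | xargs -I{} cp {} /path/to/copy/to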

It's much easier if you have one message per file, although there are tools to grep for messages in an mbox file of messages, too. (See if you have a tool called mailgrep on your system. You can also run the messages through procmail if you're familiar with that, but I'm guessing you are not, and this box is too small to begin to explain.)
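
Since the original question asked to move the matching messages rather than copy them, here is a sketch of that variant for the one-message-per-file case. All paths and file names are invented, and it assumes file names contain no newline characters.

```shell
# Hypothetical sandbox: directories, file names and contents are made up.
work=$(mktemp -d)
mkdir "$work/messages" "$work/matched"
printf 'abc\nxyz\n' > "$work/patterns.txt"
printf 'From: a@abc.com\n'   > "$work/messages/1.eml"
printf 'From: b@other.org\n' > "$work/messages/2.eml"
printf 'From: c@xyz.net\n'   > "$work/messages/my mail 3.eml"

# One grep run over the whole folder; save the list of matching files first
grep -r -l -F -f "$work/patterns.txt" "$work/messages" > "$work/matches.list"

# Move each listed file; reading line by line keeps names with spaces intact
while IFS= read -r f; do
    mv "$f" "$work/matched/"
done < "$work/matches.list"

ls "$work/matched"
```

Writing the list to a file before moving anything means grep is finished with the folder before files start leaving it, and the list doubles as a record for the later batch print.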
 