The UNIX and Linux Forums  
UNIX for Dummies Questions & Answers If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome !!

#1
05-17-2008
m00
Registered User
Join Date: May 2008
Posts: 3
Help!!

Hi, I need help.

I have a couple of things I got stuck on

1)
I have a text file containing 25k search strings that I need to search for in a compressed file. I used the command below, but it doesn't seem to use all of the search terms.

I used zgrep --colour=always -nf [name of txt file] [name of compressed file]

I know the file contains those search strings (I tested a few of them), but somehow the command didn't show anything.


2)
Using the same text file as in 1), I need to search a folder containing 50k email messages (.eml format). If an .eml file contains a matching search string, it should be moved to another folder so I can run a batch print later on.


Please help!!!

Cheers
#2
05-17-2008
era
Forum Advisor
Herder of Useless Cats (On Sabbatical)
Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,652
Is the search string a single long string over multiple lines which you want to find in exactly that order? grep -f and friends generally read a file of search expressions, one per line.
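A minimal sketch of that one-pattern-per-line behaviour, using made-up file names and patterns in a scratch directory (none of these come from the original question):

```shell
# Hypothetical demo data in a scratch directory
workdir=$(mktemp -d)
printf 'abc\nxyz\n' > "$workdir/patterns.txt"    # two patterns, one per line
printf 'mail from abc.com\nnothing here\nxyz.org says hi\n' > "$workdir/sample.txt"

# -f reads the patterns from the file; a line of sample.txt is printed
# if it matches ANY one of them, not the two lines taken together
matches=$(grep -n -f "$workdir/patterns.txt" "$workdir/sample.txt")
printf '%s\n' "$matches"
```

Lines 1 and 3 of the sample match; a search string that spans several lines of the patterns file would be treated as several independent patterns.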

Are the email messages one per file, or is this a single file containing multiple messages? The .eml extension is not well standardized; it could be either.

If you have one message per file, grep -l searchstring *.eml will list the ones which match, but again, that assumes the search string fits on a single line.

You really could take the time to think of a thread topic which would identify this thread among the others; basically, everyone who posts here wants help, some urgently.
#3
05-17-2008
m00
Registered User
Join Date: May 2008
Posts: 3
problem with grep string pattern file over multiple files

Note taken on the thread topic.

It's one search string per line in the txt file (I cleaned each domain name down to just the word, i.e. abc.com became abc).

I have both types of email message file: 1) a single file containing multiple messages and 2) 50000 individual email messages in .eml format.

The problem is that the search doesn't seem to run all of the search strings in the txt file against the target file. It seems only a few lines of search strings are used.
#4
05-18-2008
era
Forum Advisor
Herder of Useless Cats (On Sabbatical)
Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,652
Can you split up the search file into smaller chunks? My experience is that grep will complain if the patterns file is too large, but there are probably tools which will simply truncate the patterns if they won't fit into the pattern buffer. If all the hits are on patterns near the beginning of the file, that would confirm this (admittedly somewhat weak) hypothesis.
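One way to try the chunking idea is sketched below. The pattern names, the chunk size of 1000 and the file names are all assumptions made for the demo; substitute the real patterns file and target.

```shell
workdir=$(mktemp -d)
# Hypothetical stand-ins for the real pattern list and target file
awk 'BEGIN { for (i = 1; i <= 2500; i++) printf "host%04d\n", i }' > "$workdir/patterns.txt"
printf 'mail from host0007.example\nunrelated line\nmail from host2400.example\n' > "$workdir/target.txt"

# Break the pattern list into 1000-line chunks: chunk.aa, chunk.ab, chunk.ac
split -l 1000 "$workdir/patterns.txt" "$workdir/chunk."

# Search with each chunk in turn; sort -u drops lines reported by more than one chunk
hits=$(for chunk in "$workdir"/chunk.*; do
    grep -n -F -f "$chunk" "$workdir/target.txt"
done | sort -u)
printf '%s\n' "$hits"
```

If the chunked run finds matches that the single large run missed, that would point to the patterns being truncated.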

If you can install GNU grep, at least it will complain if the patterns buffer is too large. Try also adding the -F option if your zgrep supports that; or, uncompress the files temporarily, and use fgrep.
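As a rough sketch of the uncompress-and-fgrep route, the compressed file can be streamed through gzip -dc so no uncompressed copy has to be kept on disk. The file names and contents here are invented for the demo, and it assumes the file is gzip-compressed.

```shell
workdir=$(mktemp -d)
# Hypothetical patterns and a small gzip-compressed target
printf 'abc\nxyz\n' > "$workdir/patterns.txt"
printf 'mail from abc.com\nnothing here\nxyz.org says hi\n' | gzip > "$workdir/messages.gz"

# Decompress to standard output and search the stream;
# grep -F is the standard spelling of fgrep and treats every pattern as a literal string
hits=$(gzip -dc "$workdir/messages.gz" | grep -n -F -f "$workdir/patterns.txt")
printf '%s\n' "$hits"
```

With -F, characters such as the dot in a domain name are matched literally rather than as regular expression operators.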

To copy message files which match one of the patterns in the patterns file, use something like

Code:
cp `fgrep -f patterns.txt -l -r /path/to/messages` /path/to/copy/to
It's much easier if you have one message per file, although there are tools to grep for messages in an mbox file of messages, too. (See if you have a tool called mailgrep on your system. You can also run the messages through procmail if you're familiar with that, but I'm guessing you are not, and this box is too small to begin to explain.)
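With 50k message files, the backtick form above may run into the shell's argument-length limit, and it splits file names that contain spaces. A possible variant for the one-message-per-file case is sketched here; the directory layout and file names are made up for the demo.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/messages" "$workdir/matched"
# Hypothetical patterns and two sample messages
printf 'abc\nxyz\n' > "$workdir/patterns.txt"
printf 'From: someone@abc.com\n' > "$workdir/messages/first message.eml"   # name contains a space
printf 'From: someone@other.net\n' > "$workdir/messages/second.eml"

# find hands the files to grep in batches, so a large folder never overflows
# the command line; grep -l prints only the names of files that match
find "$workdir/messages" -type f -name '*.eml' \
    -exec grep -l -F -f "$workdir/patterns.txt" {} + |
while IFS= read -r file; do
    cp "$file" "$workdir/matched/"    # use mv instead of cp to move the message
done
ls "$workdir/matched"
```

Reading one name per line copes with spaces in file names, though not with names that contain newlines.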