grep 1000s of files with 1000s of grep values
Post 302720337 by jim mcnamara on Tuesday 23rd of October 2012 07:28:57 PM
This task will definitely complete before the next ice age sets in. (humor... sort of)

Consider adding some parallelism. This will only do well on a multi-CPU box, or on a box with a CPU that supports the equivalent of hyperthreads. rdrtx1's solution is as good as it gets for a single-CPU box. You may be able to run two processes in parallel. I do not know.
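
To get a rough idea of how many processes the box can run at once, something like the sketch below may help. It assumes `getconf _NPROCESSORS_ONLN` is available, which is true on Linux and many other systems but is not guaranteed everywhere; `cpu_count` is just a name made up for this example, and it falls back to 1 when the query fails.

```shell
#!/bin/bash
# cpu_count (hypothetical helper): print the number of online CPUs,
# or 1 if the system cannot report it.
cpu_count() {
    local n
    # _NPROCESSORS_ONLN is an assumption: widely supported, not universal.
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    # Guard against empty or non-numeric output.
    case $n in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}
```

Calling `cpu_count` prints a single number. Treat it as a starting point for the number of parallel greps, not as a promise that that many will run faster.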

Split your pattern file into several smaller files, because the more lines you have in the pattern file, the more CPU is spent looking at each line in the search file.

Example with a 1000-line pattern file split into n files of m lines each: 4 x 250, or 8 x 125 might be better.
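
One way to do the split, as a sketch only: `split_patterns` is a name made up for this example, and the output names `file1`, `file2`, ... are chosen to match the `/path/to/file1` style names used in the script further down. It relies on plain `split -l`, then renames the pieces.

```shell
#!/bin/bash
# split_patterns PATTERN_FILE CHUNKS OUT_PREFIX   (hypothetical helper)
# Cuts PATTERN_FILE into at most CHUNKS pieces named OUT_PREFIX1, OUT_PREFIX2, ...
split_patterns() {
    local patterns=$1 chunks=$2 prefix=$3
    local total per piece i=1
    total=$(wc -l < "$patterns")
    # Lines per piece, rounded up so nothing is left over.
    per=$(( (total + chunks - 1) / chunks ))
    [ "$per" -gt 0 ] || return 1
    # split names its output PREFIX.tmp.aa, PREFIX.tmp.ab, ...
    split -l "$per" "$patterns" "$prefix.tmp."
    # Rename the pieces to PREFIX1, PREFIX2, ... in sorted order.
    for piece in "$prefix".tmp.*; do
        mv "$piece" "$prefix$i"
        i=$((i + 1))
    done
}
```

For the 8 x 125 case that would be `split_patterns /path/to/patterns 8 /path/to/file`, where `/path/to/patterns` stands in for wherever the full pattern file lives.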

This benefits from disk controller caching and from having grep run through fewer lines of patterns for each line of source. Let's say you think 8 parallel processes will do well.
Some systems do NOT do better with this, so set up a small test first.
Code:
#!/bin/bash
# Run 8 greps in parallel on each data file, one per pattern chunk (file1 .. file8).
cd /directory/with/zillions/of/files || exit 1

# One part file per chunk, so parallel writers never interleave their output.
for i in 1 2 3 4 5 6 7 8; do : > "/path/to/result.$i"; done

for fname in *
do
 [ -f "$fname" ] || continue
 for i in 1 2 3 4 5 6 7 8
 do
  grep -f "/path/to/file$i" -- "$fname" >> "/path/to/result.$i" &
 done
 wait   # let all 8 greps finish before starting the next file
done

# Combine the per-chunk results into the final result file.
cat /path/to/result.[1-8] > /path/to/result
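
For the small test mentioned above, a rough timing helper along these lines might be enough. This is a sketch, not a benchmark tool: `time_sample` is a made-up name, it only looks at the first few files in a directory, and bash's `SECONDS` counter has one-second resolution, so use a sample big enough to take a while. Run each variant more than once, since the second run may be helped by caching.

```shell
#!/bin/bash
# time_sample DIR SAMPLE PATTERN_FILE...   (hypothetical helper)
# Greps the first SAMPLE regular files in DIR, running one grep per
# PATTERN_FILE in parallel, and prints the elapsed wall-clock seconds.
time_sample() {
    local dir=$1 sample=$2
    shift 2
    local start=$SECONDS f p n=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        [ "$n" -lt "$sample" ] || break
        n=$((n + 1))
        for p in "$@"; do
            # Output is discarded; only the elapsed time matters here.
            grep -f "$p" -- "$f" > /dev/null &
        done
        wait
    done
    echo "$# pattern file(s), $n data file(s): $((SECONDS - start))s"
}
```

Compare, say, `time_sample /directory/with/zillions/of/files 200 /path/to/patterns` (the unsplit pattern file, at a placeholder path) against `time_sample /directory/with/zillions/of/files 200 /path/to/file[1-8]`. If the split run is not clearly faster, the parallel version is probably not worth it on that box.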

 
