Best Alternative to Search Text Strings in a Directory
Hi All,
We have a file "Customers.lst". It contains a list of all the customers.
There is a directory with a number of text files, each containing the names of defaulting customers.
We want to search for every customer in "Customers.lst" across those files, and then
print the defaulter's name and the file in which it appears.
Currently we read the Customers.lst file line by line and run grep in the directory for each line.
Is there a better way to do this, or any suggestions to make the above command faster?
Regards,
Arun M
Last edited by Scott; 05-08-2011 at 09:15 AM..
Reason: Added code tags
Please use code tags.
A few minor suggestions to your code:
= Useless Use Of Cat; no need for it
= '-name "*"' is redundant; be more specific, or omit it if you are searching through all files
= sed doesn't need to do matching; use the anchor '^'
= do use double quotes around the grep pattern variable (not needed inside the sed command, though)
Is there a reason why you 'cd' in each loop iteration? Does $dir change?
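Taken together, the suggestions above might look something like the sketch below. It is only a guess at the shape of your loop, not your actual code: the sample names, the two log files and the temporary directory are made up so that it runs on its own.

```shell
# Hypothetical setup (not from the original post): a customer list and two log files.
workdir=$(mktemp -d)
dir="$workdir/logs"
mkdir -p "$dir"
printf 'John Doe\nJane Roe\n' > "$workdir/Customers.lst"
printf 'john doe\n' > "$dir/file1.log"
printf 'nobody here\n' > "$dir/file2.log"

# Read the list through redirection (no cat), quote the grep pattern,
# and let sed prefix the name at the start ('^') of each filename grep -l reports.
# No cd inside the loop, since $dir does not change.
result=$(
  while IFS= read -r name; do
    grep -il -- "$name" "$dir"/* | sed "s/^/$name, /"
  done < "$workdir/Customers.lst"
)
printf '%s\n' "$result"
```

Note that putting the name straight into the sed replacement only works while the names contain no '/' or '&'.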
a) '<file' redirection is not much more efficient than your for loop; it really just saves one process (cat). It is best to avoid the shell loop and run the file through a filter, like awk.
And since you're not using find(1) to do anything super useful, it can be omitted as well. Something along these lines:
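A minimal sketch of that idea, pieced together from the explanation further down rather than copied from the original snippet, so treat the exact wording of the command as an assumption. The two sample names and the temporary directory are invented for illustration.

```shell
# Hypothetical sample list (not from the original post).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'John Doe\nJane Roe\n' > Customers.lst

# For every line of Customers.lst, print (do not run) one grep | sed pipeline.
commands=$(awk '{ printf "grep -il \"%s\" * | sed \"s/^/%s, /\"\n", $0, $0 }' Customers.lst)
printf '%s\n' "$commands"
```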
This will print the commands to stdout. Look at the output, pick one command and execute it, and if everything seems well, pipe the whole thing to bash and capture the output to a file:
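Roughly like this, as a sketch under assumptions: the logs directory, the file contents and the result.txt name are all made up, and the output file is kept outside the searched directory so that grep's '*' does not pick it up.

```shell
# Hypothetical setup (not from the original post).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir logs
printf 'John Doe\n' > Customers.lst
printf 'nothing to see\n' > logs/file1.log
printf 'payment missed by JOHN DOE\n' > logs/file2.log

# Run from inside the directory being searched, because the generated commands use '*'.
cd logs || exit 1

# Generate the commands with awk, execute them with bash, capture the output in a file.
awk '{ printf "grep -il \"%s\" * | sed \"s/^/%s, /\"\n", $0, $0 }' ../Customers.lst |
  bash > "$workdir/result.txt"

cat "$workdir/result.txt"
```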
b) if you omit the double quotes and you're searching for a 2-word pattern, grep takes the second word as a file to search in:
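A small demonstration, assuming a POSIX-style shell (sh or bash) and one made-up log file. The exact wording of grep's error message varies between implementations, but it names "Doe" as the file it could not open.

```shell
# Hypothetical setup (not from the original post).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'John Doe\n' > file1.log
name="John Doe"

# Unquoted: the shell splits $name, so grep looks for "John" in a file called "Doe".
# Only stderr is captured here; stdout is discarded.
unquoted_err=$(grep -il $name file1.log 2>&1 >/dev/null)

# Quoted: the whole name is passed to grep as one pattern.
quoted=$(grep -il "$name" file1.log)

printf 'unquoted stderr: %s\n' "$unquoted_err"
printf 'quoted match:    %s\n' "$quoted"
```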
Sorry, the redirection into a variable doesn't work as expected. The following, entered straight at the shell prompt, works fine.
Let me format it nicer:
Input:
Outputs:
So awk is used to print out a bunch of commands with your names from Customers.lst. Nicely, each command is on one line.
Now what does each command do when you execute it:
will search for "John Doe", case insensitive (-i), in all files present (*) and output the filename only if string is found (-l).
So if you have 3 log files present in current directory:
and only one -- file2.log -- contains "John Doe", then:
that's the filename that grep returns.
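The three-file case can be tried like this; the file contents are made up, only the file names and the grep options come from the description above.

```shell
# Hypothetical contents for the three log files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alice\n' > file1.log
printf 'overdue: john doe\n' > file2.log
printf 'bob\n' > file3.log

# -i ignores case, -l lists only the names of files that contain a match.
matches=$(grep -il "John Doe" *)
printf '%s\n' "$matches"
```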
Now you want <name>, <filenameThatContains_name>
so that's what sed does; it inserts 'John Doe, ' in the beginning of line ('^'), so:
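In isolation, with the filename fed in by hand instead of coming from grep, the sed step looks like this:

```shell
# '^' matches the (empty) start of the line, so the substitution just prepends text.
line=$(printf 'file2.log\n' | sed 's/^/John Doe, /')
printf '%s\n' "$line"
```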
Those '%s' are format specifiers for printf; each one tells printf to print a string. $0,$0 are the arguments to printf, telling it to substitute $0 for each %s. In awk, $0 is the whole record, i.e. the name (if you don't have anything else on the line in Customers.lst).
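A stripped-down illustration of the same printf mechanics, with a made-up format string so the two substitutions are easy to tell apart:

```shell
# Each %s is replaced, in order, by the next argument; here both arguments are $0.
out=$(printf 'John Doe\n' | awk '{ printf "first=%s second=%s\n", $0, $0 }')
printf '%s\n' "$out"
```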
So the awk magic is just to format and print out a nice command to the screen. Then you can test out one of those commands, if you wish, or just assess them visually (useful for debugging, before you actually run them).
Then pipe to bash to execute, and redirect to capture output.
You might need to insert a directory name for grep's argument, instead of just plain '*' to adjust for details of your dir structure.
I know it's not quite elegant, but it should be faster.
Loops in shell are not nearly as efficient. Awk excels in the speed of reading lines from input -- it's a filter, after all, well crafted for this particular purpose. You are still going to launch a process for each grep, though; this can be optimized further (e.g. parallelized across the CPUs).
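One possible way to parallelize, as a rough sketch only: it assumes an xargs that supports -0 and -P (GNU and BSD xargs do, but neither option is in POSIX), and it swaps awk for tr plus xargs. The -F flag, the job count of 4, the logs directory and the sample data are my additions, not part of the approach above.

```shell
# Hypothetical setup (not from the original post).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir logs
printf 'John Doe\nJane Roe\n' > Customers.lst
printf 'john doe\n' > logs/file1.log
printf 'JANE ROE\njohn doe\n' > logs/file2.log

# Turn newlines into NULs so names with spaces stay intact, then run up to
# 4 grep | sed pipelines at once, one per name. -F treats the name as a fixed
# string. Output order is not deterministic, hence the final sort.
parallel_out=$(
  tr '\n' '\0' < Customers.lst |
    xargs -0 -P 4 -I{} sh -c 'grep -ilF -- "$1" logs/* | sed "s/^/$1, /"' _ {} |
    sort
)
printf '%s\n' "$parallel_out"
```

Whether this actually beats the serial version depends on how many names and files you have, so it is worth timing both.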
If you want, you can do a little benchmark: take a subset of your logfiles, and process them with your script and mine.
Run each one with 'time' (for example: time ./yourscript.sh, where the script name is just a placeholder), which will spit out the time it takes for the script to finish. I'd be curious...
Also I can't wait for answers from more experienced *nix people.