Shell script to search a pattern in a directory and output number of find counts
I need a shell script which takes two inputs:
1) the main directory in which it has to search, and
2) the pattern to search for in all files (.c and .h files) within the main directory.
It has to print the number of matches found in the main directory and in each subdirectory.
main dir --> Total pattern found = 5
|
sub dir --> 3
|
sub dir --> 2
Don't take any of what follows personally. It is intended solely as a helpful critique.
None of the solutions quoted below is very good.
Quote:
Originally Posted by RudiC
This is not a very elegant solution, but as a starting point try
where countit is a shell script:
and main_dir and pattern need to be supplied by you.
Filenames with n occurrences of embedded whitespace will be counted n+1 times. wc -l would be a better choice.
If the pattern matches a directory name, that subdirectory's contents will be counted even if they do not match the pattern. ls -d will prevent this, but will not prevent the matching directory from being counted if the intent is to count only files.
If pattern were the script's first argument, the script would be compatible with the much more efficient -exec ... {} + syntax. The body of the script could then be put within a for-loop iterating over "$@".
Is there even any point in using ls for this? A for-loop which expands the pattern could easily sidestep all of these issues. Within the loop, test can avoid counting anything that isn't a regular file. Also, using the pattern to generate arguments for ls may face a stricter system length limit than the shell for-loop's list expansion.
In my opinion, it's not worth trying to fix this approach's bugs. Better to abandon it.
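For comparison, here is a minimal sketch of the for-loop alternative described above, with the pattern as the first argument and the directories iterated from "$@". It is only an illustration, not the code from the quoted post. It assumes the pattern is a simple glob such as *.c without embedded whitespace (it is expanded unquoted on purpose), and the variable names are made up.

```shell
#!/bin/sh
# countit PATTERN DIR...
# For each DIR, print the number of regular files directly inside it
# whose names match the shell pattern PATTERN, followed by DIR.
countit() {
    pattern=$1
    shift
    for dir in "$@"; do
        count=0
        # $pattern is deliberately unquoted so the shell expands it.
        # "$dir" stays quoted, so whitespace in directory names is safe.
        for f in "$dir"/$pattern; do
            # -f skips directories, and also skips the unexpanded
            # pattern itself when nothing matches.
            if [ -f "$f" ]; then
                count=$((count + 1))
            fi
        done
        printf '%s %s\n' "$count" "$dir"
    done
}
```

Saved as an executable script named countit (with a final line calling countit "$@"), it could be driven by something like find main_dir -type d -exec ./countit 'pattern' {} + so that one process handles many directories.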
Quote:
Originally Posted by RudiC
Look at this thread to find a less clumsy solution than mine above. Still not too performant...
This will search all directories below the current directory and will give the count (+1) of files/dirs present in each directory.
Launching an entire shell once per filename is not an efficient approach.
If the pathname has whitespace or begins with a dash, there will be problems.
Why does that code make ls work harder for no reason? It generates the long format and forces a reverse time sort when the only thing done with the output is a line count.
Quote:
Originally Posted by RudiC
This is really performant, provided you have dirname on your system:
This suggestion is nearly a very good one. Unfortunately, it won't yield the desired result.
find will very likely not generate all of a directory's contents in one contiguous chunk. It will begin outputting file names from dir A (for example), then descend into A/B, then A/B/C, then back up to A/B, and finally resume where it left off in A. Even if your find does not behave that way, it is allowed to do so. When this happens, the result is multiple, non-consecutive counts for the same directory.
The output of find needs to be sorted before uniq sees it. Also, I think that instead of executing dirname once per filename, it would be better to use one instance of sed to filter find output.
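A minimal sketch of that pipeline, assuming no filename contains a newline. The function name is made up for illustration, and this is not the code from the quoted post.

```shell
#!/bin/sh
# count_per_dir MAIN_DIR PATTERN
# Print "count directory" for every directory under MAIN_DIR that
# holds at least one regular file whose name matches PATTERN.
count_per_dir() {
    find "$1" -type f -name "$2" |
        sed 's|/[^/]*$||' |   # strip the last path component (one sed instead of many dirname calls)
        sort |                # make identical directory names adjacent
        uniq -c               # count each run of identical lines
}
```

Directories with no matching files produce no output line at all, and the width of the count column printed by uniq -c varies between systems.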
For a massive number of files, that sort could require a lot of memory. If necessary, one can trade memory for CPU by executing find once per directory (still much better than a full shell once per file):
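A minimal sketch of that trade-off, assuming a find that supports -maxdepth (as the GNU and BSD versions do) and directory names without embedded newlines. The function name is made up for illustration.

```shell
#!/bin/sh
# count_per_dir_lowmem MAIN_DIR PATTERN
# The outer find lists every directory. For each one, an inner find
# counts only the matching regular files directly inside it, so no
# sort of the full file list is needed.
count_per_dir_lowmem() {
    find "$1" -type d | while IFS= read -r d; do
        # tr removes the padding some wc implementations add.
        n=$(find "$d" -maxdepth 1 -type f -name "$2" | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$d"
    done
}
```

Unlike the sort-based pipeline, this prints a line (with a count of 0) for directories that contain no match.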
If maxdepth is not available, recursion can still be avoided with a slightly cumbersome use of -prune.
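One way the -prune variant might look is the common "/." idiom: the starting point is named ".", so it is the only path that escapes the prune. This is a sketch under that assumption, with a made-up function name.

```shell
#!/bin/sh
# count_in_dir_prune DIR PATTERN
# Count regular files directly inside DIR whose names match PATTERN,
# without -maxdepth.
count_in_dir_prune() {
    # "$1/." gives the starting point the basename ".".
    # ! -name . is false for the starting point, so find descends into it.
    # Everything below it passes ! -name . and is pruned, which stops
    # find from descending into any subdirectory.
    find "$1/." ! -name . -prune -type f -name "$2" | wc -l | tr -d ' '
}
```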
Something to keep in mind: In some of the approaches the pattern is expanded by the shell and in others it's passed to find. The shell will not match a hidden file against a leading wildcard (?, *); find will.
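A small demonstration of that difference, assuming a shell with default globbing settings (for example, bash without dotglob). The function and file names are made up for illustration.

```shell
#!/bin/sh
# hidden_demo: count *.c files in a scratch directory two ways.
hidden_demo() {
    demo=$(mktemp -d)
    touch "$demo/visible.c" "$demo/.hidden.c"

    # Shell expansion: a leading * does not match the leading dot.
    shell_count=0
    for f in "$demo"/*.c; do
        if [ -f "$f" ]; then
            shell_count=$((shell_count + 1))
        fi
    done

    # find -name: the same pattern also matches the hidden file.
    find_count=$(find "$demo" -type f -name '*.c' | wc -l | tr -d ' ')

    echo "shell: $shell_count, find: $find_count"
    rm -rf "$demo"
}
```

Calling hidden_demo should report one match for the shell loop and two for find.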
Don't take any of what follows personally. It is intended solely as a helpful critique.
...
Regards,
Alister
Absolutely not! You may have inferred from the various edits that I have taken a multi-step approximation to the problem and its solution. I wasn't happy with the first ones either, running scripts or multiple shells when descending directory trees. I'll carefully analyse your proposal, as I'm "always learning" (agama's motto) and I really appreciate every single one of your posts.
On the other hand, it is quite intimidating to know that my posts are being scrutinized that carefully!
Last edited by RudiC; 08-08-2012 at 03:22 PM..
Reason: Addition