recursive wc on a directory?


 
# 8  
Old 06-19-2009
Swings and roundabouts. For data amounting to a million lines, my suggestion is slightly slower for, say, 20,000 small files but faster for a small number of large files, because "cat" is more efficient than "wc" at reading from disc. My method gives much less of a CPU hit. In the real world I use both constructs.

This is interesting.

Code:

time wc -l bigfile
5487935 bigfile

real       12.1
user       10.7
sys         1.3

time cat bigfile|wc -l

real       12.0
user        0.1
sys         2.5
5487935

# 9  
Old 06-19-2009
You only had one run of each. Do more runs and find out.
# 10  
Old 06-19-2009
find . -type f | xargs cat | wc -l

replacing . with the directory name ;-)

(Not sure what good timing wc is, but interesting. I'm guessing loading two programs (wc and cat) into memory is slower than loading just one? I'd love it if Microsoft would bear that in mind! Oops, probably violated a rule there... sorry.)

Last edited by Scott; 06-20-2009 at 07:43 PM..
# 11  
Old 06-20-2009
Hmm. The "scottn" idea breaks if any filenames contain space characters, because xargs splits its input on whitespace.
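
One common way around the whitespace problem is to pass NUL-delimited names from find to xargs. This is only a sketch: it assumes a find that supports -print0 and an xargs that supports -0 (GNU and BSD versions do, older systems may not), and the demo directory and file names are made up for illustration.

```shell
# Build a throwaway directory with one awkward filename (hypothetical names).
demo=$(mktemp -d)
printf 'a\nb\n' > "$demo/plain.txt"
printf 'c\nd\ne\n' > "$demo/with space.txt"

# -print0 / -0 separate names with NUL bytes, so spaces in names are harmless.
total=$(find "$demo" -type f -print0 | xargs -0 cat | wc -l)
echo "$total"

rm -rf "$demo"
```

xargs still batches many files into each cat invocation here, so this should keep most of the speed of the plain "xargs cat" form.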


I ran my test a few times first to eliminate o/s first-time buffering before posting those figures. Results were still interesting.

The "pludi" solution is very good and exhibits lateral thought, but on this occasion the UUOC (Useless Use Of Cat) argument is debatable because on my system "wc" is less efficient at reading files from disc than "cat".

BTW, I can produce the required output using only "find" and shell commands, but it proved horrendously slow to read large volumes of data with a shell read.
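
The poster's script is not shown, so the following is only a guess at what a find-plus-shell-read approach might look like. The demo files and variable names are invented for illustration. It reads every line with the shell's own read builtin, which is why it gets slow on large volumes of data.

```shell
# Throwaway test data (hypothetical names).
demo=$(mktemp -d)
list=$(mktemp)
printf 'a\nb\n' > "$demo/one.txt"
printf 'c\nd\ne\n' > "$demo/two words.txt"

# Save the file list first so the counting loop runs in the current shell
# and the total survives. Names containing newlines would still break this.
find "$demo" -type f > "$list"

total=0
while IFS= read -r file; do
    # One read call per line of data: correct, but very slow for big files.
    while IFS= read -r line; do
        total=$((total + 1))
    done < "$file"
done < "$list"
echo "$total"

rm -rf "$demo" "$list"
```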



There is life in the old cat yet!
# 12  
Old 06-20-2009
Sorry, I got a bit confused as to what thread I was writing in when I said that. I meant no disrespect. It is interesting.

You're right, but Unix filenames don't "normally" have spaces.

Time: 8 - 10 seconds (on some FS full of rubbish on my system):
Code:
find . -type f | xargs -I{} cat "{}" | wc -l

Time: 2 - 3 seconds (but doesn't handle some files):
Code:
find . -type f | xargs cat | wc -l

Ergo: awk rocks (and cat is cool) :)
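
Not something tested in this thread, but a possible middle ground between the two commands above is to let find batch the filenames itself with "-exec ... {} +". find hands the names to cat directly, so spaces are not an issue, and it starts one cat per batch of files rather than one per file. A small sketch with made-up file names:

```shell
# Throwaway test data, including a directory and a file with spaces in the name.
demo=$(mktemp -d)
mkdir "$demo/sub dir"
printf 'one\ntwo\n' > "$demo/a.txt"
printf 'three\n' > "$demo/sub dir/b c.txt"

# "{} +" makes find append as many names as fit to each cat command line.
count=$(find "$demo" -type f -exec cat {} + | wc -l)
echo "$count"

rm -rf "$demo"
```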
# 13  
Old 06-21-2009
Quote:
Originally Posted by scottn
Sorry, I got a bit confused as to what thread I was writing in when I said that. I meant no disrespect. It is interesting.

You're right, but Unix filenames don't "normally" have spaces.

Time: 8 - 10 seconds (on some FS full of rubbish on my system):
Code:
find . -type f | xargs -I{} cat "{}" | wc -l

Time: 2 - 3 seconds (but doesn't handle some files):
Code:
find . -type f | xargs cat | wc -l

Ergo: awk rocks (and cat is cool) :)
Did you run each example more than once, or are the examples you posted exactly what you tested?

After the first run, depending on your filesystem and OS, there's a good chance that a lot of the data is going to be cached and not read from disk.
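
A rough way to check for the caching effect is to repeat both commands in a loop and ignore the first pass. This sketch assumes bash (it relies on the "time" keyword, which reports on stderr) and uses a small generated file rather than a real multi-million-line one, so the absolute numbers will not mean much.

```shell
# Generate a test file (hypothetical name and size).
demo=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print i }' > "$demo/bigfile"
lines=$(wc -l < "$demo/bigfile")

runs=0
for i in 1 2 3 4 5; do
    echo "run $i"
    # Run 1 may include reading from disc; later runs are probably served from cache.
    time wc -l < "$demo/bigfile"
    time cat "$demo/bigfile" | wc -l
    runs=$((runs + 1))
done

rm -rf "$demo"
```

If runs 2 to 5 agree with each other but differ from run 1, the first figure was probably measuring the disc rather than the commands.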
# 14  
Old 06-21-2009
I ran them lots of times... but I was running Linux in a VM on Windows. And who knows what that gets up to while you're not looking!

The point, it seems, is that if you can trust that your filenames don't contain spaces, and so don't have to check for them, it's quicker than if you do need to check.

In any case, the awk solution was nicer.
 