The UNIX and Linux Forums  

#1 methyl, 06-19-2009
Swings and roundabouts. For data amounting to about a million lines, my suggestion is slightly slower for, say, 20,000 small files, but faster for a small number of large files, because on my system "cat" is more efficient than "wc" at reading from disc. My method also gives much less of a CPU hit (see the "user" times below). In the real world I use both constructs.

This is interesting.

Code:

time wc -l bigfile
5487935 bigfile

real       12.1
user       10.7
sys         1.3

time cat bigfile|wc -l

real       12.0
user        0.1
sys         2.5
5487935

#2 scottn, 06-19-2009
find . -type f | xargs cat | wc -l

replacing . with the directory name ;-)

(Not sure what good timing wc is, but it is interesting. I'm guessing loading two programs (wc and cat) into memory is slower than loading just one? I'd love it if Microsoft would bear that in mind! Oops, probably violated a rule there... sorry.)

Last edited by scottn; 06-20-2009 at 06:43 PM..

#3 methyl, 06-20-2009
Hmm. The "scottn" idea breaks if any filenames contain space characters.
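
One way to keep the xargs approach working with awkward filenames is to pass the names NUL-delimited, or to let find run cat itself. A minimal sketch, assuming a find and xargs that support -print0 and -0 (the GNU and BSD versions do); the temporary directory and the file names are made up for the demonstration:

```shell
#!/bin/sh
# Build a small throwaway tree; directory and file names are hypothetical.
demo=$(mktemp -d)
printf 'a\nb\n' > "$demo/plain.txt"
printf 'c\nd\ne\n' > "$demo/name with spaces.txt"

# NUL-delimited names: -print0 and -0 are common extensions (GNU, BSD),
# so names containing spaces or newlines reach cat intact.
total_nul=$(find "$demo" -type f -print0 | xargs -0 cat | wc -l)

# Alternative without xargs: find batches the names itself with "+".
total_exec=$(find "$demo" -type f -exec cat {} + | wc -l)

echo "print0 | xargs -0: $total_nul"
echo "-exec cat {} +:    $total_exec"
rm -rf "$demo"
```

Both forms still hand many files to each cat process, so they should stay close to the speed of the plain xargs pipeline.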


I ran my test a few times first to eliminate o/s first-time buffering before posting those figures. Results were still interesting.

The "pludi" solution is very good and exhibits lateral thought, but on this occasion the UUOC argument is arguable because on my system "wc" is less efficient at reading files from disc than "cat".

BTW. I can produce the required output by using only "find" and shell commands but it proved to be horrendously slow to read large volumes of data with a shell read.
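
For illustration, a find-plus-shell-only version might look roughly like the sketch below. This is a guess at the general shape, not methyl's actual script; the temporary directory, file names, and variable names are all hypothetical.

```shell
#!/bin/sh
# Throwaway test data; names are hypothetical.
demo=$(mktemp -d)
printf 'a\nb\n' > "$demo/one.txt"
printf 'c\nd\ne\n' > "$demo/two words.txt"

# Save the file list first so the counter is not lost in a pipeline subshell.
list=$(mktemp)
find "$demo" -type f > "$list"

count=0
# Outer loop: one filename per line (names containing newlines would still break this).
while IFS= read -r file; do
    # Inner loop: the shell itself reads every line, which is the slow part.
    while IFS= read -r _line; do
        count=$((count + 1))
    done < "$file"
done < "$list"

echo "$count"
rm -rf "$demo" "$list"
```

Every line costs a shell-level read, which fits the observation that this gets very slow on large volumes of data.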



There is life in the old cat yet!

#4 scottn, 06-20-2009


Sorry, I got a bit confused as to what thread I was writing in when I said that. I meant no disrespect. It is interesting.

You're right, but Unix filenames don't "normally" have spaces.

Time: 8 - 10 seconds (on some FS full of rubbish on my system):
Code:
find . -type f | xargs -I{} cat "{}" | wc -l
Time: 2 - 3 seconds (but doesn't handle some files):
Code:
find . -type f | xargs cat | wc -l
Ergo: awk rocks (and cat is cool)
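
A likely reason for the gap between those two timings is that xargs with -I{} runs the command once per input line, while plain xargs packs many names into each command line. A small sketch that makes the difference visible by counting echo invocations instead of running cat (the temporary directory and file names are made up):

```shell
#!/bin/sh
# Five empty files in a throwaway directory; names are hypothetical.
demo=$(mktemp -d)
for i in 1 2 3 4 5; do : > "$demo/file$i"; done

# With -I{}, xargs starts one echo per input line, so one output line per file.
per_file=$(find "$demo" -type f | xargs -I{} echo {} | wc -l)

# Without -I, xargs fits all five names into a single echo here.
batched=$(find "$demo" -type f | xargs echo | wc -l)

echo "invocations with -I{}: $per_file"
echo "invocations batched:   $batched"
rm -rf "$demo"
```

With tens of thousands of small files, one process per file adds up quickly.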

#5 achenle, 06-21-2009
Did you run each example more than once, or is what you posted exactly what you tested?

After the first run, depending on your filesystem and OS, there's a good chance that a lot of the data is going to be cached and not read from disk.
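
To take caching out of the comparison, one option is a warm-up read followed by several timed runs of each form. A rough sketch, assuming bash (where "time" is a keyword that covers a whole pipeline); the generated file is only a stand-in for the bigfile used earlier in the thread:

```shell
#!/usr/bin/env bash
# Stand-in test file; the size and contents are arbitrary assumptions.
bigfile=$(mktemp)
seq 1 200000 > "$bigfile"

# Warm-up read so every timed run starts from a similar cache state.
cat "$bigfile" > /dev/null

for run in 1 2 3; do
    echo "run $run: wc -l < file"
    time wc -l < "$bigfile"
    echo "run $run: cat file | wc -l"
    # In bash, the time keyword measures the whole pipeline, not just cat.
    time cat "$bigfile" | wc -l
done

lines=$(wc -l < "$bigfile")
rm -f "$bigfile"
```

Comparing the later runs, rather than the first one, gives a fairer picture of the commands themselves.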

#6 ghostdog74, 06-19-2009
You only had one run of each. Do more runs and find out.
All times are GMT -4.