The main reason is that your original had the following logic:-
- Start a process to read a line from the input
- Start a process to perform the cut *1
- Compare the result, looking for the value 27
- If it matches, start a process for another cut *2
- Display the result
- Go back to the top to read the next line
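To make that concrete, here is a hypothetical reconstruction of that loop. I don't have your actual script, so the comma delimiter, the field numbers and the sample data are all assumptions; only the shape of the logic is the point:

```shell
# Per-line loop: every iteration launches one or two cut processes.
while read -r line; do
    key=$(printf '%s\n' "$line" | cut -d, -f2)      # cut process *1, once per line
    if [ "$key" = "27" ]; then
        val=$(printf '%s\n' "$line" | cut -d, -f3)  # cut process *2, once per match
        echo "$val"
    fi
done <<'EOF'
1,27,apple
2,5,banana
3,27,cherry
EOF
```

Three input lines here means three launches of cut for *1 and two more for *2; scale that up to 400 lines and the process count dominates the runtime.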
For a 400 line file, you are forcing 400 cut processes to be run for *1 and another set for the cut in *2. Depending on your shell, you might also start 400 read processes, plus 400 echo statements in *1, and more for *2 for each line matching the value 27.
All of this generates vast amounts of work just in the overheads. I'm not very good with awk myself, but it all runs in a single process, so it is excellent if you can invest the time to get into the syntax. My variation removed many of these processes, but could probably still be improved. Every process launch requires memory to be allocated, perhaps logs to be written, paging/swap space to be adjusted and so on, so before a process actually does anything there is a significant overhead - and there may be end-of-process overheads too.
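For comparison, here is the same filter sketched as a single awk process. Again, the delimiter, field numbers and sample data are assumptions rather than your actual file, but the structure carries over directly:

```shell
# One awk process scans every line: split fields on commas, test
# field 2 against 27, print field 3 on a match. No per-line process
# launches at all.
awk -F, '$2 == 27 { print $3 }' <<'EOF'
1,27,apple
2,5,banana
3,27,cherry
EOF
```

Whether the file is 3 lines or 400, this still costs exactly one process.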
The use of the cat at the front makes it more readable for some, although I'm sure purists may not agree. I suppose it depends how you describe your logic in your mind before writing the code. I just tried to follow your logic with a few tweaks, so that it doesn't become too different and need documentation or lots of work on your part to decipher, but it's the difference between thinking:-
- Working on this file, I will do these things to it, versus
- Do these things on this input file
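Those two mindsets map directly onto two ways of feeding the loop. A small sketch (the file name and contents are made up for illustration):

```shell
tmp=$(mktemp)                     # hypothetical input file
printf 'red\ngreen\n' > "$tmp"

# "Working on this file, I will do these things to it":
cat "$tmp" | while read -r line; do echo "$line"; done

# "Do these things on this input file":
while read -r line; do echo "$line"; done < "$tmp"

rm -f "$tmp"
```

Both print the same lines, but the redirected form saves the cat process, and in many shells it also keeps the loop out of a subshell, so variables set inside the loop survive after it finishes.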
I hope that this clarifies and helps,
Robin
Liverpool/Blackburn
UK