Shells, forks, subprocesses... oh my


 
# 1  
Old 03-10-2017

All,
I've been reading to try to get an abstract idea of the process efficiency of commands such as sed, bash, perl, awk, find, grep, etc.

Which commands will spawn a process, fork, or launch a subshell, and under what conditions?
How do you know which commands have the faster and better stdio implementation?

So I am looking for some guru advice instead of running thousands of use cases for different configurations.

Example: finding a specific line in multiple files spanning a volume.

I can use something like this:

Code:
sed 'LINENOq;d' "$dir/$filename"

which seems very fast for searching many (60,000+) files of <10 KB each (ASCII, UTF-8), but one could also use

Code:
tail -n +LINENO "$dir/$filename" | head -n 1

which seems fairly fast as well. One could probably also come up with a few one-liners in perl.
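
For comparison, here is a minimal sketch of the approaches above side by side, plus an awk variant. The sample files, the directory, and the line number 3 are made up for illustration; the last command is one possible way to cover many files with few processes, not a measured recommendation.

Code:
```shell
#!/bin/sh
# Hypothetical sample data: two small files in a scratch directory.
n=3
dir=$(mktemp -d)
printf '%s\n' one two three four > "$dir/a.txt"
printf '%s\n' uno dos tres > "$dir/b.txt"

# sed: delete every line except line n, and quit at n so the rest is not read.
by_sed=$(sed "${n}q;d" "$dir/a.txt")

# tail + head: same result, but two processes and a pipe per file.
by_tail=$(tail -n "+$n" "$dir/a.txt" | head -n 1)

# awk: print line n, then stop.
by_awk=$(awk -v n="$n" 'NR == n { print; exit }' "$dir/a.txt")

printf '%s\n' "$by_sed" "$by_tail" "$by_awk"

# Many files: FNR restarts at 1 for each file, so one awk process can
# handle a whole batch of files that find passes to it with "{} +".
all=$(find "$dir" -type f -exec awk -v n="$n" 'FNR == n { print }' {} + | sort)
printf '%s\n' "$all"

rm -rf "$dir"
```

With 60,000+ small files, the per-file cost of starting sed or tail and head is likely to matter more than how quickly each tool reads a 10 KB file, which is why the last form batches files into as few awk processes as find allows.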
# 2  
Old 03-10-2017
sed, bash, perl, awk, find, grep are all processes. A subshell is a process. A fork is a fork is a fork.
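
A quick way to check which names cost a process is to ask the shell how it resolves them. This is a small sketch assuming a POSIX shell such as bash or dash, where echo is a builtin; the variable names are only for illustration.

Code:
```shell
#!/bin/sh
# command -v prints a bare name for a builtin and a full path for an
# external program, which needs a fork and exec every time it is run.
echo_kind=$(command -v echo)
sed_kind=$(command -v sed)
printf 'echo -> %s\nsed  -> %s\n' "$echo_kind" "$sed_kind"

# A subshell works on its own copy of the shell's state, so changes
# made inside the parentheses do not reach the parent shell.
x=parent
( x=child )
printf 'x is still %s\n' "$x"
```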

Whether any of these is faster or slower than other ways to solve your problem really depends on your problem and the algorithm you use to solve it. So "one solution to solve everything, forever" may be out the window.

There are some cardinal sins to avoid:
  • Don't reprocess the same file repeatedly. You can almost always do everything in one pass that you could do in two.
  • Don't launch whole processes to process tiny amounts of data. echo "a b c" | awk '{ print $1 }' is a tragic waste; this is when shell builtins would be thousands of times more efficient.
  • Running your innermost loop in the shell will be slow. A while read loop going line by line over a file will be slower than awk '{ something }' filename. Shell is for the high-level things, not the nitty-gritty bulk work. This is when externals would be thousands of times more efficient.
  • If you're doing cat | awk | sed | cut | tr | kitchen | sink, put it all in one awk. awk is a programming language capable of replacing all of these with some near-trivial code, and one awk call will be faster than ten of anything else.
  • Useless Use of Cat. Don't do that. Nothing needs a cat | in front of it to read a file.
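
The list above can be sketched roughly as follows. The sample data and the grep/cut/tr pipeline are invented stand-ins for a real "kitchen sink" pipeline, and the parameter expansion is just one builtin way to pick off the first word.

Code:
```shell
#!/bin/sh
# Made-up sample data for illustration.
f=$(mktemp)
printf '%s\n' 'a b c' 'd e f' 'g h i' > "$f"

# Tiny data: parameter expansion is done by the shell itself, so no
# process is launched (instead of: echo "$line" | awk '{ print $1 }').
line='a b c'
first=${line%% *}

# Bulk data: both print column 1, but the loop runs read and printf
# once per line inside the shell, while awk does it all in one process.
loop_out=$(while read -r col1 rest; do printf '%s\n' "$col1"; done < "$f")
awk_out=$(awk '{ print $1 }' "$f")

# A long pipeline: four processes, starting with a useless cat ...
pipe_out=$(cat "$f" | grep e | cut -d ' ' -f 2 | tr '[:lower:]' '[:upper:]')
# ... and a single awk that reads the file once and does the same work.
one_awk=$(awk '/e/ { print toupper($2) }' "$f")

printf 'first=%s pipeline=%s awk=%s\n' "$first" "$pipe_out" "$one_awk"
rm -f "$f"
```

On three lines the difference is invisible; the point is only that each pair produces the same output, so the cheaper form can be swapped in where the data is large or the command sits inside a loop.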
# 3  
Old 03-15-2017
Quote:
Originally Posted by Corona688
sed, bash, perl, awk, find, grep are all processes. A subshell is a process. A fork is a fork is a fork.

Many thanks. This is what I was looking for: general rules of thumb.