all,
I've been reading to try to get an abstract idea of the process efficiency of commands: sed, bash, perl, awk, find, grep, etc.
Which ones will spawn a process, fork, or launch a subshell, and under what conditions?
How do you know which commands have the faster and better stdio implementation?
So I am looking for some guru advice instead of running thousands of use cases for different configurations.
Example: finding a specific line in multiple files spanning a volume.
I can use something like this
which seems very fast for searching many (60,000+) files of <10 KB ASCII/UTF-8, but one could also use
which seems fairly fast as well. One could also probably come up with a few one-liners in perl.
sed, bash, perl, awk, find, grep are all processes. A subshell is a process. A fork is a fork is a fork.
Whether any of these is faster or slower than other ways to solve your problem really depends on your problem and the algorithm you use to solve it. So "one solution to solve everything, forever" may be out the window.
There are some cardinal sins to avoid:
Don't reprocess the same file repeatedly. You can almost always do everything in one pass that you could do in two.
Don't launch whole processes to process tiny amounts of data. echo "a b c" | awk '{ print $1 }' is a tragic waste; this is when shell builtins would be thousands of times more efficient.
Running your innermost loop in the shell will be slow. A while read loop line by line over a file will be slower than awk '{ something }' filename. Shell is for the high level things, not the nitty gritty bulk work. This is when externals would be thousands of times more efficient.
If you're doing cat | awk | sed | cut | tr | kitchen | sink, put it all in one awk. awk is a programming language capable of replacing all of these with some near-trivial code, and one awk call will be faster than ten of anything else.
Useless Use of Cat. Don't do that. Nothing needs a cat | in front of it to read a file.
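A rough sketch of the points above, assuming a POSIX-like shell and awk. The sample file, its contents, and the variable names are made up purely for illustration. It is not a benchmark, just one way each rule might look in practice.

```shell
#!/bin/sh
# Hypothetical sample data, created only for this illustration.
tmpfile=$(mktemp)
printf 'a b c\nd e f\n' > "$tmpfile"

# Tiny data: take the first word with a shell parameter expansion
# instead of echo "a b c" | awk '{ print $1 }'. No extra process is started.
line="a b c"
first=${line%% *}
echo "first=$first"

# Bulk data: let one awk process loop over the file instead of a
# shell "while read" loop. awk opens the file itself, so no cat is needed.
awk '{ print $1 }' "$tmpfile"

# One pass instead of two: count lines matching two different patterns
# in a single read of the file, rather than running grep -c twice.
counts=$(awk '/a/ { a++ } /d/ { d++ } END { print a + 0, d + 0 }' "$tmpfile")
echo "counts=$counts"

# One awk instead of cat | grep | cut | tr:
# select lines containing "d", take field 2, upper-case it.
result=$(awk '/d/ { print toupper($2) }' "$tmpfile")
echo "result=$result"

rm -f "$tmpfile"
```

Whether the gain matters depends on how often the code runs; inside a loop over tens of thousands of files, the saved forks tend to add up.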
Many thanks. This is what I was looking for: general rules of thumb.