02-23-2010
I'm trying to achieve the following goals:
1. Only read the log files from disk once.
2. Only decompress the log files once.
3. Process the log files with multiple parallel processes.
4. Wait for them all to complete.
5. Do something else.
The reason is that disk I/O and decompression are expensive compared to log processing, so I only want to do each of them once. Running the processing in parallel also makes use of my quad-core processor to further reduce run time.
I think I have what I need, but I'm open to any other suggestions for how to do this better.
With the background jobs method you suggest, I would have to:
1. Create an array of 6 unique temporary pipe names.
2. Run mkfifo for each of the 6 pipes.
3. Start sending the decompressed logs to the 6 named pipes using tee.
4. Launch the 6 processing pipelines in the background, telling them to read from the named pipes.
5. Use wait to wait for them all.
6. rm all the temporary pipes.
7. Continue my script.
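For reference, the steps above could be sketched roughly as follows. This is only a minimal illustration, not a tested production script: it uses three consumers instead of six, the printf line stands in for the real decompression command, and the consumers (wc, grep, sort) and all file names are placeholders I made up. Note that the consumers are started before tee here, because opening a FIFO for writing blocks until a reader has opened it.

```shell
#!/usr/bin/env bash
# Sketch: one producer, several parallel consumers, connected by named pipes.
# All names and the sample data below are hypothetical placeholders.

workdir=$(mktemp -d)
p1="$workdir/pipe1"
p2="$workdir/pipe2"
p3="$workdir/pipe3"

# Create one FIFO per consumer.
mkfifo "$p1" "$p2" "$p3"

# Start the consumers in the background; each one reads its own FIFO.
wc -l         < "$p1" > "$workdir/lines.txt"  &
grep -c ERROR < "$p2" > "$workdir/errors.txt" &
sort -u       < "$p3" > "$workdir/unique.txt" &

# Produce the data once and fan it out to every FIFO with tee.
# In a real script the printf would be the decompression command.
printf 'ERROR a\nINFO b\nERROR a\n' | tee "$p1" "$p2" "$p3" > /dev/null

# Wait for every background consumer to finish.
wait

# Collect the results before cleaning up.
lines=$(tr -d ' ' < "$workdir/lines.txt")
errors=$(cat "$workdir/errors.txt")
unique=$(wc -l < "$workdir/unique.txt" | tr -d ' ')

# Remove the temporary pipes and result files.
rm -rf "$workdir"

echo "lines=$lines errors=$errors unique=$unique"
```

Keeping everything inside one mktemp -d directory means a single rm -rf handles step 6, whichever consumers were created.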
This would work too, but is it better than using 'tee >(cmd1) >(cmd2) ... >(cmd6)' and waiting for each command to echo something into one named pipe that I have to manage manually? At least with 'tee >(...)' the shell takes care of creating and destroying the data pipes for me.
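A rough sketch of that process substitution variant, as I understand it, might look like this. It assumes bash, uses two consumers instead of six, and again substitutes printf for the real decompression command; the consumers, the file names, and the use of file descriptor 3 are all placeholders of my own. It also assumes that opening a FIFO read-write does not block, which holds on Linux but is not guaranteed by POSIX.

```shell
#!/usr/bin/env bash
# Sketch: fan out with tee and process substitution, then wait for each
# consumer to report completion on a single "done" FIFO.
# All names and the sample data below are hypothetical placeholders.

workdir=$(mktemp -d)
done_fifo="$workdir/done"
mkfifo "$done_fifo"

# Open the FIFO read-write on fd 3 so neither side blocks on open
# (assumption: Linux behaviour; POSIX leaves this unspecified).
exec 3<>"$done_fifo"

# Each consumer writes its result, then echoes one line to fd 3.
printf 'ERROR a\nINFO b\nERROR a\n' |
    tee >(wc -l | tr -d ' ' > "$workdir/lines.txt"; echo done >&3) \
        >(grep -c ERROR > "$workdir/errors.txt"; echo done >&3) \
        > /dev/null

# Block until both consumers have reported in (one line each).
for i in 1 2; do
    read -r token <&3
done

# Close the FIFO, collect the results, and clean up.
exec 3>&-
lines=$(cat "$workdir/lines.txt")
errors=$(cat "$workdir/errors.txt")
rm -rf "$workdir"

echo "lines=$lines errors=$errors"
```

The trade-off compared with the background jobs version is that only one FIFO has to be managed by hand, at the cost of counting completion messages instead of calling wait.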
TEE(2) Linux Programmer's Manual TEE(2)
NAME
tee - duplicating pipe content
SYNOPSIS
#define _GNU_SOURCE
#include <fcntl.h>
ssize_t tee(int fd_in, int fd_out, size_t len, unsigned int flags);
DESCRIPTION
tee() duplicates up to len bytes of data from the pipe referred to by the file descriptor fd_in to the pipe referred to by the file
descriptor fd_out. It does not consume the data that is duplicated from fd_in; therefore, that data can be copied by a subsequent
splice(2).
flags is a series of modifier flags, which share the name space with splice(2) and vmsplice(2):
SPLICE_F_MOVE Currently has no effect for tee(); see splice(2).
SPLICE_F_NONBLOCK Do not block on I/O; see splice(2) for further details.
SPLICE_F_MORE Currently has no effect for tee(), but may be implemented in the future; see splice(2).
SPLICE_F_GIFT Unused for tee(); see vmsplice(2).
RETURN VALUE
Upon successful completion, tee() returns the number of bytes that were duplicated between the input and output. A return value of 0 means
that there was no data to transfer, and it would not make sense to block, because there are no writers connected to the write end of the
pipe referred to by fd_in.
On error, tee() returns -1 and errno is set to indicate the error.
ERRORS
EINVAL fd_in or fd_out does not refer to a pipe; or fd_in and fd_out refer to the same pipe.
ENOMEM Out of memory.
VERSIONS
The tee() system call first appeared in Linux 2.6.17.
CONFORMING TO
This system call is Linux-specific.
NOTES
Conceptually, tee() copies the data between the two pipes. In reality no real data copying takes place though: under the covers, tee()
assigns data in the output by merely grabbing a reference to the input.
EXAMPLE
The following example implements a basic tee(1) program using the tee() system call.
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <limits.h>
int
main(int argc, char *argv[])
{
    int fd;
    int len, slen;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <file>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }

    do {
        /*
         * tee stdin to stdout.
         */
        len = tee(STDIN_FILENO, STDOUT_FILENO,
                  INT_MAX, SPLICE_F_NONBLOCK);

        if (len < 0) {
            if (errno == EAGAIN)
                continue;
            perror("tee");
            exit(EXIT_FAILURE);
        } else
            if (len == 0)
                break;

        /*
         * Consume stdin by splicing it to a file.
         */
        while (len > 0) {
            slen = splice(STDIN_FILENO, NULL, fd, NULL,
                          len, SPLICE_F_MOVE);
            if (slen < 0) {
                perror("splice");
                break;
            }
            len -= slen;
        }
    } while (1);

    close(fd);
    exit(EXIT_SUCCESS);
}
SEE ALSO
splice(2), vmsplice(2), feature_test_macros(7)
COLOPHON
This page is part of release 3.25 of the Linux man-pages project. A description of the project, and information about reporting bugs, can
be found at http://www.kernel.org/doc/man-pages/.
Linux 2009-09-15 TEE(2)