Processing a file list via named pipe


 
# 1  
Old 05-17-2013
I have a ksh93 script that processes a list of files in the order they appear in the list. I would like to speed this up by having several processes work through the list at once. My idea is to write the list to a named pipe and start a configurable number of worker processes. Each worker would read one file name from the pipe, process that file, then read the pipe again, until nothing is left. The files would then be processed in close to their original list order, which I need.

My only concern is how to stop the workers from hanging on the named pipe, waiting for more input, once there are no files left to process. Would the best approach be to write some kind of exit flag to the pipe, once per worker, so that each worker reads the flag instead of a file name and knows to exit? Or is there a better way?
# 2  
Old 05-18-2013
Named pipes are usually used between two processes. The problem I see with this approach is that named pipes are byte oriented. While you are writing the line-oriented file list to the named pipe, the reading processes will be snatching bytes off the pipe, and once a byte has been read, it is gone.

In other words, parallel readers could garble the file names, so you would need some kind of signaling to synchronize the reads so that only one process reads at a time.

Another approach might be to use one named pipe per reading process.
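
As a rough sketch of the one-pipe-per-reader idea: each worker reads only its own FIFO, so no two readers compete for the same bytes. Each worker also ends on its own when the dispatcher exits and its FIFO reaches end of file, so no exit flag is needed. The temporary directory, the `worker` function, the sample list, and the use of awk as the dispatcher are all assumptions made for illustration, not something taken from the posts above.

```shell
#!/bin/sh
# Sketch: one FIFO per worker, filled round-robin by a single dispatcher.
# All names below (workdir, worker, filelist) are hypothetical.
workdir=$(mktemp -d)
nworkers=2
printf '%s\n' a b c d e > "$workdir/filelist"

worker() {
    # $1 = worker number; read file names from this worker's own FIFO
    # until the dispatcher closes it (end of file).
    while IFS= read -r f; do
        echo "$f" >> "$workdir/out.$1"   # stand-in for the real processing
    done < "$workdir/fifo.$1"
}

i=0
while [ "$i" -lt "$nworkers" ]; do
    mkfifo "$workdir/fifo.$i"
    worker "$i" &
    i=$((i + 1))
done

# awk keeps every output file open until it exits, so each worker sees
# end of file only after the whole list has been handed out.
awk -v n="$nworkers" -v dir="$workdir" '
    BEGIN { for (i = 0; i < n; i++) printf "" > (dir "/fifo." i) }
    { f = dir "/fifo." ((NR - 1) % n); print > f; fflush(f) }
' "$workdir/filelist"

wait    # let all workers finish
out0=$(cat "$workdir/out.0")
out1=$(cat "$workdir/out.1")
rm -rf "$workdir"
```

Note that the assignment is static: a slow file holds up the rest of that worker's share, and list order is preserved only within each worker.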
# 3  
Old 05-18-2013
Hi.

Perhaps consider xargs, which is probably already available on your system, or GNU Parallel.
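
For instance, a minimal sketch with xargs, assuming an xargs that supports the `-P` option (GNU findutils and the BSDs do; it is not required by POSIX). The list file and the `worker` script here are hypothetical stand-ins for the real ones.

```shell
#!/bin/sh
# Sketch: let xargs keep up to two workers busy at a time.
workdir=$(mktemp -d)
printf '%s\n' foo bar baz > "$workdir/filelist"

# Hypothetical per-file worker script.
cat > "$workdir/worker" <<'EOF'
echo "processed $1"
EOF

# -P 2: run up to two workers at once; -n 1: one file name per invocation.
result=$(xargs -P 2 -n 1 sh "$workdir/worker" < "$workdir/filelist" | sort)
echo "$result"
rm -rf "$workdir"
```

With `-P` the completion order is not guaranteed, which is why the sketch sorts the collected output.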

Best wishes ... cheers, drl
# 4  
Old 05-21-2013
Well, a small test quickly showed me that a named pipe would not be an easy solution. As long as one process has the pipe open, any other process gets the message "ksh: fifo: cannot open [Device or resource busy]", which doesn't help me any.

I might have to resort to xargs, though I would prefer to keep this as a single script, and xargs won't work with functions.

Another idea is to run several background processes and keep track of their PIDs. Every second or so I could pipe ps into awk to count how many of them are still running, and if that is fewer than the maximum I want, start more, until no files remain in the list. I will experiment with this solution, but please feel free to direct me towards something better. - Thanks.
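
A possible variation on the background-process idea that avoids polling ps: remember the PIDs in the positional parameters and, once the limit is reached, wait for the oldest one before starting another. This is only a sketch; `process_file`, the sample list, and the limit of 3 are hypothetical, and because the work is done in a shell function it stays within a single script.

```shell
#!/bin/sh
# Sketch: run at most $max background workers at a time without polling ps.
max=3
workdir=$(mktemp -d)
printf '%s\n' f1 f2 f3 f4 f5 > "$workdir/filelist"
mkdir "$workdir/done"

process_file() {
    # Stand-in for the real per-file work: leave a marker file.
    : > "$workdir/done/$1"
}

set --                      # positional parameters hold PIDs of started workers
while IFS= read -r f; do
    process_file "$f" &
    set -- "$@" "$!"
    if [ "$#" -ge "$max" ]; then
        wait "$1"           # block until the oldest worker has exited
        shift
    fi
done < "$workdir/filelist"
wait                        # wait for the remaining workers

done_count=$(ls "$workdir/done" | wc -l | tr -d ' ')
echo "processed $done_count files"
rm -rf "$workdir"
```

Waiting for the oldest worker rather than the first one to finish can leave a slot idle for a while. bash 4.3 and later offer `wait -n` to wait for any child, but the loop above does not rely on it.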
# 5  
Old 05-21-2013
Hi.

Functions will not work when named directly as the command (see output), but one can create an embedded script:
Code:
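#!/usr/bin/env bash

# @(#) s1	Demonstrate xargs and parallel, succeed with embedded script.

# Utility functions: pe prints its arguments, pl prints a section header,
# db prints a debug message to stderr.
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
# Redefine db as a no-op to silence debugging output.
db() { : ; }
# The function we will try to run via xargs and parallel.
yoohoo() { for _i;do printf "%s" "$_i";done; printf "\n"; }
export -f yoohoo
# Optional local utility that reports the environment and tool versions.
C=$HOME/bin/context && [ -f "$C" ] && "$C" xargs parallel

# Input file: first argument, or data1 by default.
FILE=${1-data1}

pl " Input file $FILE:"
head "$FILE"

pl " Results, default:"
cat "$FILE" |
xargs -L 1

yoohoo
yoohoo " ( Define helper script )"
# Write the helper to disk so that xargs and parallel can execute it.
cat <<"EOF" > s1-helper
#!/usr/bin/env bash
echo " PID = $$; args = $*"
EOF
chmod +x s1-helper
head s1-helper

pl " Results, external script, xargs:"
cat "$FILE" |
xargs -L 1 ./s1-helper

pl " Results, external script, parallel:"
cat "$FILE" |
parallel ./s1-helper

pl " Expecting functions to fail."
pe " Why? http://www.perlmonks.org/index.pl?node_id=484296"

# First attempt: the function as the xargs command.
pl " Results, function:"
cat "$FILE" |
xargs yoohoo

# Second attempt: the function as the parallel command.
pl " Results, function:"
cat "$FILE" |
parallel yoohoo

exit 0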

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
xargs (GNU findutils) 4.4.0
parallel GNU parallel 20111122

-----
 Input file data1:
foo
bar
baz

-----
 Results, default:
foo
bar
baz

 ( Define helper script )
#!/usr/bin/env bash
echo " PID = $$; args = $*"

-----
 Results, external script, xargs:
 PID = 12328; args = foo
 PID = 12329; args = bar
 PID = 12330; args = baz

-----
 Results, external script, parallel:
 PID = 12391; args = foo
 PID = 12411; args = bar
 PID = 12431; args = baz

-----
 Expecting functions to fail.
 Why? http://www.perlmonks.org/index.pl?node_id=484296

-----
 Results, function:
xargs: yoohoo: No such file or directory

-----
 Results, function:
yoohoo: Command not found.
yoohoo: Command not found.
yoohoo: Command not found.

This was with bash, but running with ksh 93s+ (removing export -f) produced essentially the same output.
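
One workaround that is often used with bash, offered here as a sketch rather than something tested in the run above: instead of naming the function as the command, have xargs start a child bash, which imports the exported function from its environment and can then call it. This relies on bash's `export -f`, so it does not carry over as is to the ksh run mentioned above. The body of `yoohoo` below is a simplified, hypothetical version.

```shell
#!/usr/bin/env bash
# Sketch (bash only): run an exported function through xargs by starting
# a child bash that imports the function from the environment.
yoohoo() { printf 'yoohoo got: %s\n' "$1"; }
export -f yoohoo

# "_" fills $0 of the child shell; the item supplied by xargs arrives as $1.
result=$(printf '%s\n' foo bar baz | xargs -n 1 bash -c 'yoohoo "$1"' _)
echo "$result"
```

Each item costs one extra bash process, which is usually negligible next to the per-file work.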

Best wishes ... cheers, drl

Last edited by drl; 05-21-2013 at 05:40 PM..