Dsh command - shell script - sys args?


 
# 1  
Old 06-13-2014

Sorry, a noobie question....!

I want to use a Linux cluster to copy a list of files. I want to split the processing over 3 nodes so that each node gets (more or less) an equal share.

My script (base.sh) to execute my copy script (copy.sh) looks something like:
Code:
#!/bin/bash

for NODE in 1 2 3
do
        /sw/egs/bin/dsh -w node0${NODE} -e "/home/ts/scripts/copy.sh"

done

My copy.sh file is:

Code:
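#!/bin/bash

IDIR="/home/ts/test2/temp"
ODIR="/home/ts/test3/"
FARRAY=( "$IDIR"/*.R )
COUNT=${#FARRAY[@]}

THIS_NODE=      # not set yet: see question 1 below
TOTAL_NODES=3

for i in `seq 1 $COUNT`
do
        # Bash arrays are 0-indexed, so file number i lives at index i-1
        THIS_FILE=${FARRAY[$((i - 1))]}
        REM=`expr $i % $TOTAL_NODES`

        # Map remainder 0 to the last node, so nodes are numbered 1..TOTAL_NODES
        if [ "$REM" -eq 0 ]
        then
                REM=$TOTAL_NODES
        fi

        if [ "$REM" -eq "$THIS_NODE" ]
        then
                cp "$THIS_FILE" "$ODIR"
        fi
done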

My questions:

1. How do I capture which node is running this job (THIS_NODE) in copy.sh?
2. How can I modify the base.sh script so that the total number of nodes can also be passed into the copy.sh script? (sys args? - how?)

Is there a better/shorter/sleeker way to do what I am doing? Any other suggestions?

thanks!
# 2  
Old 06-13-2014
Quote:
Originally Posted by pc2001
Sorry, a noobie question....!

I want to use a Linux cluster to copy a list of files. I want to split the processing over 3 nodes so that each node gets (more or less) an equal share.

Let's stop and think about this. Three CPUs can run programs faster because they run independently of each other. But how exactly are three CPUs going to speed up your disk? It doesn't matter if your CPUs can send disk-read commands faster than light; the disk can only do so much, and one CPU is going to max that out. Three may start it thrashing -- slowing it down.

Additionally -- that you're having problems copying lots of files fast enough tells me there may be another problem here, like millions of files crammed in one folder, which multithreading cannot solve either.

There are a few very specific circumstances where multiple threads may speed this up -- a NAS with independent dedicated links, or some odd kinds of software RAID -- but I don't consider this likely without being told.

Back up and tell me more about your system and the problem you are trying to solve, please.

Last edited by Corona688; 06-13-2014 at 03:09 PM..
# 3  
Old 06-13-2014
Ha, quite right.

I just used copy as an example. In the real job, each file will go through some processing instead of a copy, and some output from that will be written to disk.

Sorry, I should have mentioned...
# 4  
Old 06-13-2014
My apologies. I was having flashbacks to a thread dealing with ten million files in one directory; the OP refused to believe his disk couldn't be multithreaded via some magic Perl or Python code... Now that that's out of the way!

Do all three machines share the same disk? If not, dsh won't be useful here!

1) Your loops are overcomplicated. Instead of
Code:
ARR=( whatever ) ; for i in `seq ...`

do this:
Code:
for FILE in whatever/*
do
        ...
done

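2) Shell globbing can get into trouble when there are large numbers of files: passing whatever/* to an external command such as cp fails with 'Argument list too long' once the expanded list exceeds the system limit. (A plain for FILE in whatever/* loop is not subject to that limit, but it still builds the whole list in memory before the first iteration.) Better to use a utility like find and print to a pipe when you don't know how many files there are.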

3) Don't have your sub-programs check which files are "theirs" -- tell them which files are theirs. Feed the file names into the program so they don't have to guess. This avoids problems with the nodes getting out of sync (for example, if a file is added to the folder after one node has listed it and before another does).
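
To illustrate point 2, here is a minimal sketch of feeding file names through a pipe rather than a glob. The helper name for_each_r_file is made up for this example, and the -print0 / read -d '' pairing (bash plus GNU or BSD find) is an addition so that names containing spaces or newlines survive. Treat it as a sketch under those assumptions, not a drop-in script.

```shell
#!/bin/bash

# for_each_r_file DIR COMMAND [ARGS...]
# Runs COMMAND once per *.R file directly inside DIR, with the file name
# appended as the last argument. (Hypothetical helper, for illustration.)
for_each_r_file() {
        local dir=$1
        shift
        # find writes NUL-terminated names to the pipe; no argument list is
        # built, so the number of files does not matter.
        find "$dir" -mindepth 1 -maxdepth 1 -type f -name '*.R' -print0 |
        while IFS= read -r -d '' file
        do
                "$@" "$file"
        done
}

# Example: show what would be handled in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/first.R" "$demo_dir/second file.R" "$demo_dir/notes.txt"
for_each_r_file "$demo_dir" echo "would process"
rm -rf "$demo_dir"
```

One caveat: if COMMAND itself reads standard input, it will swallow the rest of the file list, so redirect its input from /dev/null in that case.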

Code:
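#!/bin/bash

NODES=3

# Start with one empty list per node (node01..node03, as in base.sh), so each
# redirect below has a file to read even if a node gets no files.
for ((N=1; N<=NODES; N++))
do
        : > /tmp/$$-$N
done

# With a huge number of files, passing '*.R' to a command such as cp can fail
# with 'Argument list too long'. find prints the names to a pipe instead, so
# no argument list is involved at all.
N=1
find /home/ts/test2/temp/ -mindepth 1 -maxdepth 1 -type f -name '*.R' | while IFS= read -r FILE
do
        echo "file $FILE goes to node $N"
        echo "$FILE" >> /tmp/$$-$N
        N=$(( N % NODES + 1 ))          # round-robin: 1, 2, 3, 1, 2, 3, ...
done

for ((N=1; N<=NODES; N++))
do
        # I am assuming dsh can read from standard input here.
        # If this is wrong, that won't work :(
        /sw/egs/bin/dsh -w node0${N} -e "/home/ts/scripts/copy.sh" < /tmp/$$-$N &
done

# Wait for every dsh to finish before removing the lists they read from.
wait

for ((N=1; N<=NODES; N++))
do
        rm -f /tmp/$$-$N
done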

Code:
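#!/bin/bash

ODIR="/home/ts/test3/"

# Each line on standard input is one file assigned to this node.
while IFS= read -r FILE
do
        echo "Got file $FILE"
        # Dry run: this only prints the cp command. Remove 'echo' to really copy.
        echo cp "$FILE" "$ODIR"
done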
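
On the original "sys args" question: if you do still want copy.sh to pick out its own share, the node number and node count can be put inside the command string that dsh runs, and copy.sh reads them as the positional parameters $1 and $2. The following is a rough sketch, assuming dsh hands the quoted command line to the remote shell unchanged (not verified here). The helper is_mine and the defaults of 1 are made up for the example.

```shell
#!/bin/bash
# copy.sh sketch: node number and node count arrive as $1 and $2, e.g. from
#   /sw/egs/bin/dsh -w node0${NODE} -e "/home/ts/scripts/copy.sh ${NODE} 3"
# (assumes dsh forwards the quoted command string as-is)

# is_mine INDEX THIS_NODE TOTAL_NODES
# Succeeds when the file at 1-based INDEX belongs to THIS_NODE (1..TOTAL_NODES).
# (Hypothetical helper, for illustration.)
is_mine() {
        local index=$1 this_node=$2 total_nodes=$3
        local owner=$(( (index - 1) % total_nodes + 1 ))
        [ "$owner" -eq "$this_node" ]
}

THIS_NODE=${1:-1}       # defaults are assumptions, used when no arguments are given
TOTAL_NODES=${2:-1}

i=0
for FILE in /home/ts/test2/temp/*.R
do
        [ -e "$FILE" ] || continue      # glob matched nothing
        i=$((i + 1))
        if is_mine "$i" "$THIS_NODE" "$TOTAL_NODES"
        then
                echo "node $THIS_NODE would process $FILE"
        fi
done
```

This only splits the work correctly if every node sees the same files in the same order, which is the out-of-sync risk described in point 3. Handing each node an explicit list avoids that.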

# 5  
Old 06-16-2014
Thanks for the detailed explanation, Corona. I've learnt a lot!