BASH Execution Delay / Speedup


 
# 15  
Old 01-27-2015
Quote:
Originally Posted by gmark99
I apologize, Don. I don't provide all the code because I thought it would obfuscate things, but it seems I've made things more complicated. I really appreciate your patience, here.
I also apologize. I should have gone to bed at midnight this morning instead of trying to help you with your problem. I completely overlooked line #10 in your code, which wipes out the data you have just copied (and occasionally one or more further chunks of data that were appended to the file between the time the cp on the previous line completes and the time the redirection wipes out the file you copied; with 400 jobs running on your system, that gap could be minutes long).
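If you want to keep the copy-and-empty approach, an atomic rename closes that window. Just as a sketch (assuming the producer opens and closes the file for each append, and that both names are on the same filesystem so mv is a rename; the paths are illustrative):
Code:
# Sketch: replace "cp file copy; > file" with an atomic rename.
SRC=/path/to/source.csv          # file your producer appends to (illustrative)
SNAP=${SRC}.snap                 # snapshot this script reads

mv ${SRC} ${SNAP}                # rename is atomic: no lost-update window
touch ${SRC}                     # recreate (never truncate) the source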
Quote:
First, I run "ps" and look for instances of "my_job" and have a maximum number that's checked before spawning another. I've run as many as 400 to stress things, and it worked (with the exception being the problem I'm talking about here, which doesn't seem to be affected at all by that number). I currently run a maximum of 20, but at this instant, for debugging purposes, I've set the limit at one. There is some proprietary stuff inside "my_job" that I'm hesitant to show (yes, I understand how difficult that makes this!)
If you mean that you run ps somewhere in the first three lines of your script (which you stripped out of the code you showed us), that won't have any effect on the number of jobs started in the background on line 19 in the loop on lines 5 through 21.

If you mean that you run ps in my_job, that won't affect the number of jobs started in the background on line 19 in your script nor the speed with which they are spawned.

If you mean that you have another loop between lines 18 and 19 in the code you showed us that keeps you from getting to line 19 until some of your background jobs complete, that would be CRUCIAL information that completely changes the way your script works that you have hidden from us.

From what you have shown us, the only thing limiting the number of invocations of my_job that you try to run concurrently is the number of lines available to process in your input file and how fast your "producer" can write data into that file.
Quote:
As for reading the files continuously, the source of data is always on, populating the source text file. I copy the file over, erase the source copy, and then read each line until a counter exceeds the line size of the file, OR if the last line I read has a timestamp that is too old.
As I mentioned above, the way you are copying and erasing the source file will sometimes silently discard some data. But since you also discard data when it is too old (something else we can't see in your code), maybe it doesn't matter.
Quote:
As for your suggestion to read the file a single time, yes, I used that successfully and just switched back with the suspicion that that method (the method you recommend here) was causing my current problem.
I can assure you that that wasn't your problem unless the problem was that you ran out of disk space due to the size of the file, or you exceeded the maximum file size that could be written by the process that is adding data to your source file. (And the description of the symptoms you have provided does not support either of these possibilities.)
Quote:
Exit status: It executes an "exit 0" on success or failure, but results are all echoed to a log file. Failures I check are all for functions that read or write data to and from hardware, but I still exit 0, and simply report the results of those functions.
You tell us that you limit the number of jobs you are running simultaneously, but you don't show us any code that suggests that this is true. From what you have shown us, there is a high likelihood that attempts to spawn my_job in the background will fail due to exceeding the number of processes a user is allowed to run at once. Since you never wait for any of your background jobs to complete and never check the status of any of your background jobs, you will never know how many attempts to start my_job failed (and in these cases, my_job can't possibly log the fact that it never started).
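To make those failures visible, you could remember each PID and reap it later. Purely a sketch (the log file path is illustrative):
Code:
# Sketch: record each background job's PID, then reap them all and log
# any non-zero exit status, so failed starts/jobs are no longer invisible.
PIDS=""
my_job "${CMD_INPUT}" &
PIDS="$PIDS $!"

# ... later, for example at the bottom of the outer loop ...
for pid in $PIDS
do      wait "$pid"
        rc=$?
        [ "$rc" -ne 0 ] && echo "my_job ($pid) exited with status $rc" >> /path/to/logfile
done
PIDS=""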
Quote:
Would you suggest waiting until the process count (of running "my_job" instances) dropped to some lower number or perhaps zero before fetching a new file full of records?
You have ignored my requests for information about the type of system you're using and the number of threads you might be able to run concurrently. Unless you have a massively parallel processing system, running 400 background jobs is much more likely to cause thrashing and scheduling problems than it is likely to improve throughput.

What you have shown us is logically equivalent to a script like this:
Code:
while [ true ]
do      sleep 1&
done

which will bring any system to its knees in seconds.
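A bounded version of the same loop would look something like this (just a sketch; jobs -pr only counts live children of the current shell, and wait -n needs bash 4.3 or later):
Code:
MAX_JOBS=20
while true
do      # block while MAX_JOBS children are still running
        while [ "$(jobs -pr | wc -l)" -ge "$MAX_JOBS" ]
        do      sleep 1         # or "wait -n" to reap one finished job
        done
        sleep 1 &               # stand-in for: my_job "${CMD_INPUT}" &
done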
Quote:
Thanks again!!
Mark
# 16  
Old 01-29-2015
BASH Execution Delay / Speedup

Hope this helps, Don

Let me know what else I can provide, such as more of what the called programs contain. Essentially, "my_job" runs for up to 90 seconds maximum and then deposits whatever results it has in its own file for retrieval by the "unloader".
The Garbage Collection routine periodically checks for "heartbeat" files that haven't been updated in several minutes, tries to kill the process whose PID is stored inside each one (if it's still alive), and then discards the file.
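In case it helps, the unloader boils down to something like this (a sketch; the result.*.done naming is just for illustration, not the real convention):
Code:
# Unloader sketch: forward any completed result files, then discard them.
for f in ${WFE_MSGS}/result.*.done
do      [ -e "$f" ] || continue          # glob matched nothing
        cat "$f" >> ${WFE_2_SPLUNK}      # hand the results to the Splunk feed
        rm -f "$f"
done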

Code:
#!/bin/bash
_last_update="Tue Jan 14 12:39:23 CST 2015"
# Linux 2.6.32-431.29.2.el6.x86_64 #1 SMP Sun Jul 27 15:55:46 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
#
    COL_C_TIMESTAMP=1
    COL_C_COMMAND=2

###############################################################
# 3. INSTALLATION-DEPENDENT VARIABLES
###############################################################

### Place on WFE where all this cool stuff happens
# WFE_HOME=`pwd`
WFE_HOME=/home/gmark/rje

### Place on Splunk server where other cool stuff happens
# Server where WFE is running
WFE_SERVER=wfe.ready.com

WFE_CONTROL=${WFE_HOME}/op-control

### Message Buffer directory
WFE_MSGS=${WFE_HOME}/MSGS

# Scheduled Global Abate done today already?
SGA_STAT_FILE=${WFE_HOME}/wfe-sga-donetoday
echo NOT_DONE > ${SGA_STAT_FILE}

### Archive file of CSV commands for SIMULATOR
CSV_INPUT_ARCHIVE_FILE=${WFE_HOME}/csv-command-archive-file

# Common name of CSV file on both systems
CSV_NAME=work.csv

# Type of command used to transfer files
XCOMMAND=sftp

# Path to Heartbeat Timestamp file
HEARTBEAT_FILE=${WFE_HOME}/wfe-heartbeat
touch ${HEARTBEAT_FILE}

###############################################################
# 4. MASTER PROCESS CONTROL FILE READ
###############################################################
#

### WFE ROP used to log debug and for status information
WFE_2_ROP=${WFE_HOME}/wfe-ropfile

### WFE Logfile used for communication to Splunk
WFE_2_SPLUNK=${WFE_HOME}/wfe-logfile

# WFE Process ID used to enforce single System Process
WFE_PID_FILE=${WFE_HOME}/wfe-process-id

# Initialize Process ID to Enforce threading requirements
THIS_WFE_PID=$$
echo ${THIS_WFE_PID} > ${WFE_PID_FILE}

# Initial index of records in local CSV input file
CMD_INPUT_POINTER=9999999

# Initial size of Local CSV Command Buffer
LOCAL_CSV_SIZE=0

# Initial Assumed Oldest ALERT Timestamp
LOCAL_CSV_BIRTHDAY=0
CALC_TIMESTAMP=`date "+%s"`;

###############################################################
# 5. CHECK CLONE STATUS
###############################################################
#
while [ true ]
do
    MASTER_WFE_PID=`cat ${WFE_PID_FILE}`
    if [ ${THIS_WFE_PID} != ${MASTER_WFE_PID} ]
    then
        echo "...`date "+%Y-%m-%d %H:%M:%S"`: Execution Stopped ..." >> ${WFE_2_ROP};
        exit 0
    fi

###############################################################
# 6. WORK TO DO?  IF NOT, GET SOME.
###############################################################

    wfe_msg_unloader &

    wfe_garbage_collection &

    NOW_TIME=`date "+%s"`;
    CSV_AGE=$(( ${NOW_TIME} - ${LOCAL_CSV_BIRTHDAY} ))

    # Out of ALERTS? MOVE CSV from Splunk to WFE - purge any aging ALERTS
    if [ ${CMD_INPUT_POINTER} -ge ${LOCAL_CSV_SIZE} -o \
        ${CSV_AGE:=0} -gt ${MAX_ALERT_REQ_AGE} ]
    then
        touch ${HEARTBEAT_FILE}
        cat /home/gmark/rje/COMMANDS.csv | grep "A[BL][AE]" > ${LOCAL_CSV}
        > ${REMOTE_CSV}
        LOCAL_CSV_SIZE=`wc -l ${LOCAL_CSV} | sed "s;^ *;;" | sed "s; .*;;"`
        LOCAL_CSV_BIRTHDAY=${NOW_TIME}
        CMD_INPUT_POINTER=0
    fi

#   while read CMD_INPUT    # Production loop; its matching done is commented out below
    while [ true ]
    do


###############################################################
# 8. VERIFY RUN STATUS, ELSE RESET NOW TIMER
###############################################################
 
        touch ${HEARTBEAT_FILE}

# An external "control" file with RUN=YES or RUN=NO to turn this off
        RUN=`wfe_set_control RUN YES`
        if [ ${RUN} != YES ]
        then
            echo "... ${NOW_TIME}: RUN=${RUN}: Execution Stopped by Request ..." >> ${WFE_2_ROP}
            exit 0
        fi

        CMD_INPUT_POINTER=$(( ${CMD_INPUT_POINTER} + 1 ))

        NOW_TIME=`date "+%s"`;
        CSV_AGE=$(( ${NOW_TIME} - ${LOCAL_CSV_BIRTHDAY} ))

        if [ ${CSV_AGE:=0} -gt ${MAX_ALERT_REQ_AGE} ]
        then
            > ${LOCAL_CSV}
        fi


# This allows "read" statements to be placed in the loop for debugging
        CMD_INPUT=`cat ${LOCAL_CSV} | head -${CMD_INPUT_POINTER} | tail -1`

        touch ${HEARTBEAT_FILE}

	# Better ways to do this, but none as dependable
        C_COMMAND="`echo ${CMD_INPUT} | cut -d, -f${COL_C_COMMAND}`"

        if [ ${C_COMMAND}x == ALERTx -o ${C_COMMAND}x == ABATEx ]
        then
            echo ${CMD_INPUT} >> ${CSV_INPUT_ARCHIVE_FILE}
            C_TIMESTAMP="`echo ${CMD_INPUT} | cut -d, -f${COL_C_TIMESTAMP}`"

# Heartbeat file checked by another process to make sure this is still running
            touch ${HEARTBEAT_FILE}

# Uses Modulus function to do only periodic calls to the Unloader
# The Unloader checks for completed output files to forward to user
            if [ $(( ${CMD_INPUT_POINTER} % ${MAX_UNLOADER_DELAY})) ==  0 ]
            then
                wfe_msg_unloader &
            fi

            NUM_PROCS=`ps -u root | grep wfe_voice_ | wc -l`

            if [ ${NUM_PROCS} -lt ${MAX_NUM_PROCS} ]
            then
                my_job "${CMD_INPUT}" &
            else
                sleep 1
            fi
        else
            echo at ${LINENO} BOGUS COMMAND - SKIPPED >> ${WFE_2_ROP};
        fi       

#   done < ${LOCAL_CSV}
    done # Test Version for setting breakpoints


done # while TRUE

# 17  
Old 01-29-2015
Append a:
Code:
sleep 3

before the final done, as already mentioned a few times.

As of now, during each loop pass you spawn 'check' jobs into the background, ignoring whether or not the previous ones have even finished, while starting new jobs in the same loop.
As already said, even while true; do sleep 1 & done can bring a machine to its knees; imagine what background jobs that actually do something will do...

hth

PS:
You might want to have a look at: [BASH] Script to manage background scripts (running, finished, exit code)
The mods were kind and provided several working scripts.
On the 3rd page, the (currently) last post shows my solution using TUI, which runs multiple scripts in the background, limits the number of scripts allowed at once, and reports their exit status.
# 18  
Old 01-29-2015
BASH Execution Delay / Speedup

Yes, Don, it helps a LOT.

Now, this is the code that checks for existing processes (the job name is "my_job") and only sleeps if the number hits MAX_NUM_PROCS (which has been set as high as 400, but is now set at 2).

Does this work?

Code:
            NUM_PROCS=`ps -u root | grep my_job | wc -l`

            if [ ${NUM_PROCS} -lt ${MAX_NUM_PROCS} ]
            then
                my_job "${CMD_INPUT}" &
            else
                sleep 1
            fi
        else
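
(If the grep counting itself is ever a worry, a bracketed pattern avoids that; a sketch:)
Code:
# Sketch: the [m] bracket keeps this grep from matching its own command
# line; pgrep, where procps provides it, can do the count directly.
NUM_PROCS=`ps -u root | grep -c "[m]y_job"`
# NUM_PROCS=`pgrep -u root -c my_job`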

---------- Post updated at 01:28 PM ---------- Previous update was at 01:26 PM ----------

Someone asked what "heartbeat" did, since I only "touch" it. I check that with a background watchdog process that just sees if it's been touched recently, and if not, assumes this process isn't well, kills it if it exists, and replaces it.

Make sense?
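In sketch form, the watchdog does roughly this (the five-minute cutoff and the restart command are illustrative, and -mmin is a GNU find option):
Code:
# Watchdog sketch: if the heartbeat went untouched for 5 minutes, kill
# the recorded PID (if still alive) and start a replacement.
if [ -z "`find ${HEARTBEAT_FILE} -mmin -5 2>/dev/null`" ]
then
        kill `cat ${WFE_PID_FILE}` 2>/dev/null   # ignore it if already gone
        ${WFE_HOME}/wfe_start &                  # hypothetical restart wrapper
fi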

Last edited by Don Cragun; 01-29-2015 at 03:45 PM.. Reason: Add CODE tags.
# 19  
Old 01-29-2015
That was me, and I removed it because you had already answered it and I had overlooked that.
Though it's not clear to me how the watchdog would identify which process to kill, as the file is just touched and you spawned multiple jobs without 'saving' their corresponding PIDs.
(edit: Unless that is handled in that other script.)

OK, so you want to check whether enough processes have already been started. The issue is, MAX_NUM_PROCS is not set anywhere in the code you posted.
# 20  
Old 01-29-2015
BASH Execution Delay / Speedup

Okay -- I've seen a few references to process-limiting methods. What is "bctl" and how would I use it in my situation? Is my approach of using "ps" and grepping for the function name unusable? Or perhaps something like "bctl" just does it better?

Thanks again!

---------- Post updated at 01:59 PM ---------- Previous update was at 01:57 PM ----------

sea -- thanks again!

The "my_job" function keeps the PID in its own "heartbeat" file, so when the garbage collection routine comes around, it sees how long that file's been untouched, and then tries to kill "old" jobs using the contained PID.

Again, if there's a better way to do this, please, feel free to straighten me out!
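In other words, roughly this (a sketch; the per-job file naming and the five-minute cutoff are illustrative):
Code:
# Inside my_job (sketch): record our PID in a per-job heartbeat file and
# keep touching it while the job makes progress.
HB=${WFE_MSGS}/heartbeat.$$
echo $$ > ${HB}
# ... touch ${HB} periodically during the work ...

# Garbage-collection side (sketch): kill jobs whose heartbeat went stale.
for hb in ${WFE_MSGS}/heartbeat.*
do      [ -e "${hb}" ] || continue                              # nothing matched
        [ -n "`find ${hb} -mmin -5 2>/dev/null`" ] && continue  # still fresh
        kill `cat ${hb}` 2>/dev/null
        rm -f ${hb}
done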
# 21  
Old 01-29-2015
bctl is a tool written in C (C++?) by DGPickett.
He shared the code, so you can compile it on your system.

tui-psm is a tool written in bash by me (psm stands for parallel script manager).
I shared the code and made it part of its own dependency (TUI).

There was a discussion whether to use the kill, ps or /proc way to identify the processes.
For me, ps worked the best.
So, give the others a try; if they work better for you, switch, otherwise keep using ps.
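For completeness, the /proc way on Linux looks about like this (a sketch that reads the comm field out of /proc/PID/stat; command names containing spaces would need more care):
Code:
# Count running my_job processes straight from /proc (Linux only).
n=0
for stat in /proc/[0-9]*/stat
do      read -r pid comm rest < "$stat" 2>/dev/null || continue  # process may be gone
        [ "$comm" = "(my_job)" ] && n=$((n + 1))
done
echo "$n"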

hth