Retry every ten seconds while lockfile present


 
# 1  
Old 01-04-2010

Hi,

I have written the lockfile-checking script below, but it needs some tweaking.

If a lockfile is present, I need the script to retry every 10 seconds to see whether the lockfile is still there. After 120 seconds it should send an email.

In my current version, once the script encounters the lockfile it counts up to 120 seconds and sends an email regardless, even if the lockfile is no longer present.

Can someone give me some directions on how to tweak the second while ; do ; done loop?

Code:
LOCK1=$TRANSOUT/lock1.txt             # name and place of first lockfile
LOCK2=$TRANSOUT/lock2.txt             # name and place of second lockfile
HEARTBEATFILE=$TRANSOUT/heartbeat       # name and place of heartbeat file for
while :                                          # start an infinite loop here
 do
    LOGFILE=`date +$LOGDIR/COPY_LOG_%Y%m%d.log` # name and place of our daily logfile
    HEARTBEAT=`date +%Y%m%d%H%M`                 # heartbeat timestamp
    TIMESTAMP="`date +%H:%M:%S`"                 # timestamp to add to our log
    
    typeset -i count                             # declare count as an integer variable
    rm -f $LOCK1                              # remove our lockfile if we didn't exit cleanly              
    
     while [ -f $LOCK2 ]                      # while Others are busy
      do
    
          unset TIMESTAMP                        # clear timestamp so we will see the right time for the next log entry
          sleep 10                               # wait for 10 seconds and retry
          TIMESTAMP="`date +%H:%M:%S`"           # new timestamp to add to our log
          count=$((count + 1))                   # increment counter by one
    
           if [ $count -eq 12 ] ; then           # if we waited 120 seconds
           
            echo "Lockfile present" | mailx -m -s "WARNING LOCKFILE PRESENT, WAITED 120 SECONDS" john@doe.com # email the culprit
            echo "$LOCK2 PRESENT, WAITED 120 SECONDS" | sed -e 's/^/'"$TIMESTAMP"' /g' >> $LOGFILE # add message to logfile
           
            count=0                              # restart the counter
           fi
      done                                       # if no second lockfile exists (anymore)
                             
     echo $$ > $LOCK1                         # create our own lockfile with our own process ID so that others know we are working
                                                 
     cd /interface          # go to initial directory where files are
   
     ls -1 * > move.lst                      # (ONE not L!) create a list of file names in single column format in directory
                                                 # and put them into one file called move.lst
                                
      if [ -s move.lst ] ; then              # check to see if move.lst is not empty so we can continue

       while read N                              # while we read lines in move.lst
        do
         case $N in
           *) echo >>run_move.lst mv $N $TRANSOUT/ ;;                # add move command and put this in run_move.lst
          esac
        done <move.lst

       chmod 777 run_move.lst                   # make run_move.lst executable
       ./run_move.lst                           # and execute it
                                                 
       rm move.lst                           # remove original list
       cat run_move.lst | sed -e 's/^/'"$TIMESTAMP"' /g' >> $LOGFILE # add our move list to a logfile
       rm run_move.lst                          # remove move list

      else                                       # if move.lst is empty in the first place
       echo "NO FILES TO MOVE" | sed -e 's/^/'"$TIMESTAMP"' /g' >> $LOGFILE # add a message to the logfile
       rm move.lst                           # remove original list
      fi

    COPYFILE="meert.log"
    cp $LOCK1 $LOGDIR/$COPYFILE               # put the current process id to the logdir
    echo $HEARTBEAT > $HEARTBEATFILE             # put heartbeat timestamp into heartbeat file
    rm -f $LOCK1                              # remove our lockfile so the next program can start
                                                 
    unset TIMESTAMP                              # clear timestamp so we will see the right time for the next log entry
    sleep 180                                    # wait another 180 seconds and start with the whole process again
    TIMESTAMP="`date +%H:%M:%S`"                 # new timestamp to add to our log
    echo "WAITED 180 SECONDS FOR NEXT MOVE" | sed -e 's/^/'"$TIMESTAMP"' /g' >> $LOGFILE # add a message to the logfile
    unset TIMESTAMP                              # clear timestamp so we will see the right time for the next log entry
    unset HEARTBEAT                              # clear heartbeat timestamp
    unset LOGFILE                                # clear logfile name so this program keeps logging to the most current logfile
 done

# 2  
Old 01-04-2010
Code:
count=0
while [ -f $LOCK2 ]                      # while Others are busy
do
  if [ $count -eq 12 ] ; then            # if we waited 120 seconds
    TIMESTAMP="$(date +%H:%M:%S)"        # new timestamp to add to our log

    echo "Lockfile present" | .......
    echo "$LOCK2 PRESENT, WAITED 120 SECONDS" | .......

    count=0
  fi
  sleep 10                               # wait for 10 seconds and retry
  count=$((count + 1))                   # increment counter by one
done
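
The same loop can be wrapped in a function so that the counter is reset on every call. This is only a sketch: the names wait_while_locked and warn_lock_stuck are hypothetical, and warn_lock_stuck stands in for the mailx and logfile lines that are elided above.

```shell
#!/bin/sh
# Hypothetical stand-in for the mailx and logfile lines of the original script.
# $1 = lockfile, $2 = seconds waited so far
warn_lock_stuck() {
    echo "$(date +%H:%M:%S) $1 PRESENT, WAITED $2 SECONDS"
}

# wait_while_locked LOCKFILE INTERVAL MAX_TRIES   (hypothetical helper)
# Polls every INTERVAL seconds while LOCKFILE exists. After MAX_TRIES
# consecutive polls it warns once and starts counting again.
wait_while_locked() {
    lockfile=$1
    interval=$2
    max_tries=$3
    count=0                                   # fresh counter on every call
    while [ -f "$lockfile" ]; do
        if [ "$count" -eq "$max_tries" ]; then
            warn_lock_stuck "$lockfile" $((interval * max_tries))
            count=0                           # restart the counter
        fi
        sleep "$interval"
        count=$((count + 1))
    done
}

# Example call matching the thread: retry every 10 seconds, warn after 120
# wait_while_locked "$TRANSOUT/lock2.txt" 10 12
```

Because the test on the lockfile is the loop condition, the function returns as soon as the file disappears and no warning is sent for a lock that has already been released.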

# 3  
Old 01-04-2010
Hello,
A common atomic operation on a filesystem is mv.
Checking with "while [ -f $LOCK2 ]" will eventually fail some day under heavy load with concurrent access, because a check for file existence is open to a race condition.

Correct locking should look like this:

Code:
while true; do
   if mv "$COMMON" "$PRIVATE"; then
      # critical section here:
      # do the work, then release the lock
      mv "$PRIVATE" "$COMMON"
   else
      # bad luck: someone else holds the lock
      sleep 120
   fi
done

This code will work on almost any filesystem (NFS, AFS, GFS, OCFS, etc.).
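
The mv idea can also be split into two small helpers. This is a rough sketch, not the poster's code: the function names are hypothetical, the token file plays the role of $COMMON and "token.PID" the role of $PRIVATE, and it assumes both names live on the same filesystem so that mv is a plain rename rather than a copy and delete.

```shell
#!/bin/sh
# acquire_token TOKEN   (hypothetical helper)
# Tries to take the lock by renaming the shared token file to a private name.
# Only one contender can rename the file; the others find the source gone
# and mv returns non-zero.
acquire_token() {
    mv "$1" "$1.$$" 2>/dev/null
}

# release_token TOKEN   (hypothetical helper)
# Gives the lock back by renaming the private file to the shared name.
release_token() {
    mv "$1.$$" "$1"
}

# Example use, with the token file created once beforehand:
# if acquire_token /interface/token; then
#     ...critical section...
#     release_token /interface/token
# else
#     sleep 120
# fi
```

Note that the logic is inverted compared with the lockfile check in the first post: here the lock is free while the shared file exists, and taken while it is missing.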

Last edited by Franklin52; 01-04-2010 at 07:17 AM.. Reason: Please use code tags!
# 4  
Old 01-04-2010
Quote:
Originally Posted by scottn

Scott: thank you, this resolved my issue.

Quote:
Originally Posted by rrstone
rrstone: I don't understand your reply.
# 5  
Old 01-04-2010
My claim is:
Checking for a file's existence is open to a race condition.

My code provides a solution for this race condition.
# 6  
Old 01-04-2010
And this is already provided by a module called flock.
# 7  
Old 01-04-2010
flock on NFS?
You should be very careful with flock on NFS.
Bad things happen when you use different Unixes and rely on flock.
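
For comparison, here is what a flock-based wait might look like on a local filesystem. This is a sketch under assumptions: the thread does not say which flock is meant, so this uses the flock(1) utility from util-linux on Linux, and run_locked is a hypothetical wrapper. The NFS caution above still applies.

```shell
#!/bin/sh
# run_locked LOCKFILE TIMEOUT COMMAND [ARGS...]   (hypothetical wrapper)
# Runs COMMAND while holding an exclusive lock on LOCKFILE.
# Assumes flock(1) from util-linux is installed.
run_locked() {
    lockfile=$1
    timeout=$2
    shift 2
    # -w makes flock give up after TIMEOUT seconds; in that case it returns
    # non-zero without running the command. Otherwise the exit status is
    # that of the command.
    flock -w "$timeout" "$lockfile" "$@"
}

# Example: wait up to 120 seconds for the lock, then move one file
# run_locked "$TRANSOUT/lock2.txt" 120 mv somefile "$TRANSOUT/"
```

This only helps if every cooperating script takes the lock the same way; a script that merely tests whether the file exists will not see a flock lock.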