Restarting jobs after failures (UNIX for Dummies Questions & Answers)
Post 302194290 by era on Monday 12th of May 2008 03:24:54 PM
I think abcd122 just meant to keep track of which files you have finished, so you have a way of knowing which files still need processing. That's a start, but it doesn't really solve the problem as such.

I usually run GNU make for stuff that might take long and might bomb out; it will delete the output from a botched run (if you specify .DELETE_ON_ERROR) and can keep track of what's been done and what still needs to be done. (Still missing from the picture is some sort of concurrency control, to be able to see that a particular pending result is already running on another host.) It takes a bit of getting used to, but it has paid for itself handsomely a number of times.

That's a more radical refactoring than you were thinking of, I'm sure, but I'm offering it nevertheless; if interruptions are a scenario you need to take seriously, there's no way you can code a for loop which simply "knows" when it's done.

But of course, if the presence of a file is enough, then the really simple thing will work:

Code:
for f in file1 file2 file3 file4; do
  if [ -e "$f.out" ]
  then
    echo "$f.out already exists -- not rerunning"
  elif long and winding and painstaking command to calculate ruptures in space time fabric <"$f" >"$f.tmp"
  then
    # only commit when it really finished; notice this is also conditional on the exit code
    mv "$f.tmp" "$f.out"
  else
    # $? here is still the exit code of the command tested by elif
    echo "oh dear, $f failed (exit code $?), leaving output in $f.tmp" >&2
  fi
done
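
If you want to try the idea out before pointing it at a real job, here is a minimal self-contained sketch. Everything named in it is made up for illustration: `process` is a hypothetical stand-in for the long-running command (it upper-cases its input and fails on purpose when a line contains "bad"), and `run_all` is just the loop above wrapped in a function. The `mkdir` lock is my own addition, not something from the loop above; it is one simple way to approach the concurrency problem, on the assumption that all runs share the same directory and that `mkdir` is atomic on that file system.

```shell
#!/bin/sh

# Hypothetical stand-in for the real long-running command:
# upper-cases stdin, exits 1 as soon as a line contains "bad".
process() {
  awk '/bad/ { exit 1 } { print toupper($0) }'
}

# The loop from the post, plus an (assumed) mkdir-based lock per input file.
run_all() {
  for f in "$@"; do
    if [ -e "$f.out" ]; then
      echo "$f.out already exists -- not rerunning"
    elif ! mkdir "$f.lock" 2>/dev/null; then
      echo "$f is being processed by another run -- skipping"
    else
      if process <"$f" >"$f.tmp"; then
        # only commit when the command really finished
        mv "$f.tmp" "$f.out"
      else
        echo "oh dear, $f failed (exit code $?), leaving output in $f.tmp" >&2
      fi
      rmdir "$f.lock"
    fi
  done
}

# Demo in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'alpha\n' >file1
printf 'bad\n' >file2

run_all file1 file2      # first run: file1 succeeds, file2 fails
printf 'beta\n' >file2   # "fix" the broken input
run_all file1 file2      # second run: file1 is skipped, file2 is redone
```

The second call only redoes file2, because file1.out is already there. One caveat about the lock: if a run is killed between `mkdir` and `rmdir`, the stale lock directory stays behind and has to be removed by hand, so treat this as a starting point rather than real concurrency control.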

 
