02-17-2009
Hi,
If any one of the scripts in SET1 fails, how can I restart the run so that only the failed job runs again?
If a job in SET1 fails, I want to update its status to FAILED in the database.
After resolving the problem, I will update the database status to RUN for that job, and the corresponding script should then run automatically.
How can I do that? Please advise.
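One possible approach, as a minimal sketch rather than a definitive solution: keep one status record per job, run only the jobs marked RUN, and write back DONE or FAILED afterwards. The sketch below uses a flat file as a stand-in for the database table. The file name job_status.txt and the function names set_status, get_status and run_pending are hypothetical, as is the DONE status, and the jobs are assumed to be scripts in the current directory that can be run with sh. With a real database, the awk calls would be replaced by queries through that database's command-line client.

```shell
#!/bin/sh
# Hypothetical status file standing in for the database table:
# one "job status" pair per line, e.g. "Script1.ksh RUN".
STATUS_FILE=${STATUS_FILE:-./job_status.txt}

# set_status JOB STATUS: replace the record for JOB, or add it if missing.
set_status() {
    tmp="$STATUS_FILE.tmp.$$"
    touch "$STATUS_FILE"
    awk -v job="$1" '$1 != job' "$STATUS_FILE" > "$tmp"
    echo "$1 $2" >> "$tmp"
    mv "$tmp" "$STATUS_FILE"
}

# get_status JOB: print the current status of JOB (empty if unknown).
get_status() {
    awk -v job="$1" '$1 == job { print $2 }' "$STATUS_FILE" 2>/dev/null
}

# run_pending: run every job currently marked RUN and record the outcome.
# Jobs marked FAILED or DONE are skipped, so a restart only reruns
# the jobs whose status was set back to RUN.
run_pending() {
    for job in $(awk '$2 == "RUN" { print $1 }' "$STATUS_FILE"); do
        if sh "./$job"; then
            set_status "$job" DONE
        else
            set_status "$job" FAILED
        fi
    done
}
```

To make the rerun happen automatically after the status is changed to RUN, run_pending could be called periodically, for example from cron or from a loop with sleep. That polling interval is a design choice, not something the original setup specifies.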
sge_shepherd
SGE_SHEPHERD(8) Sun Grid Engine Administrative Commands SGE_SHEPHERD(8)
NAME
sge_shepherd - Sun Grid Engine single job controlling agent
SYNOPSIS
sge_shepherd
DESCRIPTION
sge_shepherd provides the parent process functionality for a single Sun Grid Engine job. The parent functionality is necessary on UNIX
systems to retrieve resource usage information (see getrusage(2)) after a job has finished. In addition, the sge_shepherd forwards signals
to the job, such as the signals for suspension, enabling, termination and the Sun Grid Engine checkpointing signal (see sge_ckpt(1) for
details).
The sge_shepherd receives information about the job to be started from the sge_execd(8). During the execution of the job it actually
starts up to 5 child processes. First a prolog script is run if this feature is enabled by the prolog parameter in the cluster
configuration. (See sge_conf(5).) Next a parallel environment startup procedure is run if the job is a parallel job. (See sge_pe(5) for more
information.) After that, the job itself is run, followed by a parallel environment shutdown procedure for parallel jobs, and finally an epilog
script if requested by the epilog parameter in the cluster configuration. The prolog and epilog scripts as well as the parallel environment
startup and shutdown procedures are to be provided by the Sun Grid Engine administrator and are intended for site-specific actions to be
taken before and after execution of the actual user job.
After the job has finished and the epilog script is processed, sge_shepherd retrieves resource usage statistics about the job, places them
in a job specific subdirectory of the sge_execd(8) spool directory for reporting through sge_execd(8) and finishes.
sge_shepherd also places an exit status file in the spool directory. This exit status can be viewed with qacct -j JobId (see qacct(1)); it
is not the exit status of sge_shepherd itself but of one of the methods executed by sge_shepherd. This exit status can have several
meanings, depending on in which method an error occurred (if any). The possible methods are: prolog, parallel start, job, parallel stop,
epilog, suspend, restart, terminate, clean, migrate, and checkpoint.
The following exit values are returned:
0      All methods: Operation was executed successfully.
99     Job script, prolog and epilog: When FORBID_RESCHEDULE is not set in the configuration (see sge_conf(5)), the job gets re-queued.
       Otherwise see "Other".
100    Job script, prolog and epilog: When FORBID_APPERROR is not set in the configuration (see sge_conf(5)), the job gets re-queued.
       Otherwise see "Other".
Other  Job script: This is the exit status of the job itself. No action is taken upon this exit status because the meaning of this exit
       status is not known.
       Prolog, epilog and parallel start: The queue is set to error state and the job is re-queued.
       Parallel stop: The queue is set to error state, but the job is not re-queued. It is assumed that the job itself ran successfully and
       only the clean up script failed.
       Suspend, restart, terminate, clean, and migrate: Always successful.
       Checkpoint: Success, except for kernel checkpointing: checkpoint was not successful, did not happen (but migration will happen by
       Sun Grid Engine).
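The job script rows of the list above can be restated as a small lookup. This is only an illustrative sketch: the function name describe_job_exit is hypothetical and not part of Sun Grid Engine, and it assumes that neither FORBID_RESCHEDULE nor FORBID_APPERROR is set. It mirrors the wording of this page and does not query a running cluster.

```shell
#!/bin/sh
# describe_job_exit STATUS: print what this man page says happens when the
# job script exits with STATUS.
# Assumption: FORBID_RESCHEDULE and FORBID_APPERROR are not set in sge_conf(5).
describe_job_exit() {
    case "$1" in
        0)   echo "success" ;;
        99)  echo "re-queued" ;;   # applies only while FORBID_RESCHEDULE is unset
        100) echo "re-queued" ;;   # applies only while FORBID_APPERROR is unset
        *)   echo "no action" ;;   # treated as the job's own exit status
    esac
}
```

For example, describe_job_exit 99 prints the action for a job script that exits with status 99.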
RESTRICTIONS
sge_shepherd should not be invoked manually, but only by sge_execd(8).
FILES
sgepasswd                        contains a list of user names and their corresponding encrypted passwords. If available, the password
                                 file will be used by sge_shepherd. To change the contents of this file please use the sgepasswd
                                 command. It is not advised to change that file manually.
<execd_spool>/job_dir/<job_id>   job specific directory
SEE ALSO
sge_intro(1), sge_conf(5), sge_execd(8).
COPYRIGHT
See sge_intro(1) for a full statement of rights and permissions.
SGE 6.2u5 $Date$ SGE_SHEPHERD(8)