SGE_REQUEST(5) Sun Grid Engine File Formats SGE_REQUEST(5)
NAME
sge_request - Sun Grid Engine default request definition file format
DESCRIPTION
sge_request describes the format of the files that define default request profiles. If available, default request files are read and processed
during job submission before any submit options embedded in the job script and before any options on the qsub(1) or qsh(1) command line are
considered. Thus, the command-line and embedded script options may override the settings in the default request files (see qsub(1) or
qsh(1) for details).
There is a cluster global, a user private and a working directory local default request definition file. The working directory local
default request file has the highest precedence and is followed by the user private and then the cluster global default request file.
Note that the -clear option to qsub(1) or qsh(1) can be used at any point, whether in a default request file, in the embedded script flags
or on the qsub(1) or qsh(1) command line, to discard all previous settings.
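The precedence rules above can be pictured with a small sketch. It is a simplified model, not the qsub(1) implementation: it assumes each source is a list of (option, value) pairs, that a later source simply replaces an earlier value for the same option, and that -clear empties everything seen so far. The queue names all.q and fast.q are hypothetical.

```python
# Simplified model of the precedence rules described above.
# Assumption: a later source replaces an earlier value for the same option;
# the real merging rules of qsub(1) for individual options may differ.

def effective_options(sources):
    """Merge option sources given in processing order (lowest precedence first)."""
    merged = {}
    for options in sources:
        for flag, value in options:
            if flag == "-clear":
                # -clear discards every setting seen so far.
                merged.clear()
            else:
                merged[flag] = value
    return merged


# Processing order: cluster global, user private, working directory, command line.
global_defaults = [("-r", "y"), ("-q", "all.q")]   # hypothetical values
user_defaults = [("-r", "n")]
cwd_defaults = []
command_line = [("-q", "fast.q")]

result = effective_options(
    [global_defaults, user_defaults, cwd_defaults, command_line]
)
print(result)  # {'-r': 'n', '-q': 'fast.q'}
```

In this model the user private file overrides the global -r setting, and the command line overrides the global -q setting.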
The format of the default request definition files is:
o The default request files may contain an arbitrary number of lines. Blank lines and lines with a '#' sign in the first column are
skipped.
o Every other line may contain any qsub(1) option as described in the Sun Grid Engine Reference Manual. More than one option
per line is allowed. The batch script file and the arguments to the batch script are not considered qsub(1) options and thus are
not allowed in a default request file.
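As an illustration of the two format rules, the following sketch reads the text of a default request file and collects its option tokens. It assumes that tokens are separated by whitespace and does not handle quoting; the function name parse_request_file is hypothetical and not part of Sun Grid Engine.

```python
def parse_request_file(text):
    """Return the option tokens found in the text of a default request file."""
    tokens = []
    for line in text.splitlines():
        # Blank lines and lines with '#' in the first column are skipped.
        if not line.strip() or line.startswith("#"):
            continue
        # Assumption: tokens are separated by whitespace (no quoting support).
        tokens.extend(line.split())
    return tokens


sample = """# Default Requests File
-l arch=sun4,s_cpu=5:0:0

-r n -cwd
"""
print(parse_request_file(sample))
# ['-l', 'arch=sun4,s_cpu=5:0:0', '-r', 'n', '-cwd']
```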
EXAMPLES
The following is a simple example of a default request definition file:
=====================================================
# Default Requests File
# request architecture sun4 and a CPU-time limit of 5 hours
-l arch=sun4,s_cpu=5:0:0
# don't restart the job in case of system crashes
-r n
=====================================================
Having defined a default request definition file like this and submitting a job as follows:
qsub test.sh
would have precisely the same effect as if the job was submitted with:
qsub -l arch=sun4,s_cpu=5:0:0 -r n test.sh
FILES
<sge_root>/<cell>/common/sge_request    global defaults file
$HOME/.sge_request                      user private defaults file
$cwd/.sge_request                       cwd directory defaults file
SEE ALSO
sge_intro(1), qsh(1), qsub(1), Sun Grid Engine Installation and Administration Guide
COPYRIGHT
See sge_intro(1) for a full statement of rights and permissions.
SGE 6.2u5 $Date$ SGE_REQUEST(5)
CHECKPOINT(5) Sun Grid Engine File Formats CHECKPOINT(5)
NAME
checkpoint - Sun Grid Engine checkpointing environment configuration file format
DESCRIPTION
Checkpointing is a facility to save the complete status of an executing program or job and to restore and restart from this so-called
checkpoint at a later point in time if the original program or job was halted, e.g. through a system crash.
Sun Grid Engine provides various levels of checkpointing support (see sge_ckpt(1)). The checkpointing environment described here is a
means to configure the different types of checkpointing in use for your Sun Grid Engine cluster or parts thereof. For that purpose you can
define the operations which have to be executed in initiating a checkpoint generation, a migration of a checkpoint to another host or a
restart of a checkpointed application as well as the list of queues which are eligible for a checkpointing method.
Supporting different operating systems may force Sun Grid Engine to introduce operating system dependencies into the checkpointing
configuration file, and updates of the supported operating system versions may lead to frequently changing implementation
details. Please refer to the <sge_root>/ckpt directory for more information.
Please use the -ackpt, -dckpt, -mckpt or -sckpt options to the qconf(1) command to manipulate checkpointing environments from the command
line, or use the corresponding qmon(1) dialogue for X-Windows based interactive configuration.
Note, Sun Grid Engine allows backslashes (\) to be used to escape newline (\newline) characters. The backslash and the newline are replaced
with a space (" ") character before any interpretation.
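The continuation rule can be sketched in a few lines. This is only an illustration of the replacement described above, not Sun Grid Engine code; the function name join_continuations and the script path are hypothetical.

```python
def join_continuations(text):
    """Replace each backslash followed by a newline with a single space."""
    return text.replace("\\\n", " ")


# A two-line entry using a hypothetical script path.
raw = "clean_command /hypothetical/clean.sh\\\n--all\n"
print(join_continuations(raw), end="")
# clean_command /hypothetical/clean.sh --all
```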
FORMAT
The format of a checkpoint file is defined as follows:
ckpt_name
The name of the checkpointing environment as defined for ckpt_name in sge_types(1). To be used in the qsub(1) -ckpt switch or for the
qconf(1) options mentioned above.
interface
The type of checkpointing to be used. Currently, the following types are valid:
hibernator
The Hibernator kernel level checkpointing is interfaced.
cpr
The SGI kernel level checkpointing is used.
cray-ckpt
The Cray kernel level checkpointing is assumed.
transparent
Sun Grid Engine assumes that the jobs submitted with reference to this checkpointing interface use a checkpointing library such as
provided by the public domain package Condor.
userdefined
Sun Grid Engine assumes that the jobs submitted with reference to this checkpointing interface perform their private checkpointing
method.
application-level
Uses all of the interface commands configured in the checkpointing object, as with the kernel level checkpointing
interfaces (cpr, cray-ckpt, etc.), except for the restart_command (see below). The restart_command is not used, even if it is
configured; the job script is invoked on restart instead.
ckpt_command
A command-line type command string to be executed by Sun Grid Engine in order to initiate a checkpoint.
migr_command
A command-line type command string to be executed by Sun Grid Engine during a migration of a checkpointing job from one host to another.
restart_command
A command-line type command string to be executed by Sun Grid Engine when restarting a previously checkpointed application.
clean_command
A command-line type command string to be executed by Sun Grid Engine in order to clean up after a checkpointed application has finished.
ckpt_dir
A file system location to which checkpoints of potentially considerable size should be stored.
ckpt_signal
A Unix signal to be sent to a job by Sun Grid Engine to initiate a checkpoint generation. The value for this field can either be a symbolic
name from the list produced by the -l option of the kill(1) command or an integer number which must be a valid signal on the systems used
for checkpointing.
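A minimal sketch of how a ckpt_signal value could be checked, using Python's standard signal module. It only tests the value against the signals known on the machine running the sketch, which is an assumption; the text above requires validity on the systems actually used for checkpointing. The function name resolve_signal is hypothetical.

```python
import signal


def resolve_signal(value):
    """Map a symbolic name (e.g. 'USR1') or a signal number to a local signal."""
    text = str(value).strip()
    if text.isdigit():
        # Raises ValueError if the number is not a signal on this machine.
        return signal.Signals(int(text))
    name = text.upper()
    # kill -l may print names with or without the SIG prefix; accept both.
    if not name.startswith("SIG"):
        name = "SIG" + name
    try:
        return signal.Signals[name]
    except KeyError:
        raise ValueError("unknown signal name: " + text)


print(resolve_signal("USR1").name)
print(resolve_signal("15").name)
```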
when
The points in time when checkpoints are expected to be generated. Valid values for this parameter are composed of the letters s, m, x and
r and any combinations thereof without any separating character in between. The same letters are allowed for the -c option of the qsub(1)
command, which overrides the definitions in the checkpointing environment used. The meaning of the letters is defined as follows:
s    A job is checkpointed, aborted and, if possible, migrated if the corresponding sge_execd(8) is shut down on the job's machine.
m    Checkpoints are generated periodically at the min_cpu_interval interval defined by the queue (see queue_conf(5)) in which a job executes.
x    A job is checkpointed, aborted and, if possible, migrated as soon as the job gets suspended (manually as well as automatically).
r    A job is rescheduled (not checkpointed) when the host on which the job currently runs has gone into unknown state and the time
interval reschedule_unknown (see sge_conf(5)) defined in the global/local cluster configuration is exceeded.
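The letter rules for the when parameter can be sketched as a small validator. The meanings are paraphrased from the list above; rejecting an empty value is an assumption of this sketch, and the function name describe_when is hypothetical.

```python
# Meanings paraphrased from the list of letters above.
WHEN_LETTERS = {
    "s": "checkpoint when sge_execd is shut down on the job's machine",
    "m": "checkpoint periodically at the queue's min_cpu_interval",
    "x": "checkpoint when the job gets suspended",
    "r": "reschedule when the host stays in unknown state too long",
}


def describe_when(value):
    """Validate a 'when' string and return the meaning of each letter in order."""
    if not value:
        # Assumption: an empty value is treated as invalid in this sketch.
        raise ValueError("empty 'when' value")
    unknown = sorted(set(value) - set(WHEN_LETTERS))
    if unknown:
        raise ValueError("invalid letter(s) in 'when': " + "".join(unknown))
    return [WHEN_LETTERS[letter] for letter in value]


for meaning in describe_when("sx"):
    print(meaning)
```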
RESTRICTIONS
Note that the checkpointing, migration and restart procedures provided by default with the Sun Grid Engine distribution, as well as the way
they are invoked in the ckpt_command, migr_command or restart_command parameters of any default checkpointing environment, should not be
changed. Otherwise, their functionality becomes the full responsibility of the administrator configuring the checkpointing environment. Sun
Grid Engine just invokes these procedures and evaluates their exit status. If the procedures do not perform their tasks properly or are not
invoked in a proper fashion, the checkpointing mechanism may behave unexpectedly, and Sun Grid Engine has no means to detect this.
SEE ALSO
sge_intro(1), sge_ckpt(1), sge_types(1), qconf(1), qmod(1), qsub(1), sge_execd(8).
COPYRIGHT
See sge_intro(1) for a full statement of rights and permissions.
SGE 6.2u5 $Date$ CHECKPOINT(5)